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Preface  i 

"Applied Climatology" can be defined as the scientific analysis of
climatological data in the light of its useful application for a practical
purpose. This technical report offers the AWS meteorologist a useful guide to
applied climatological practices. It is distinctly oriented towards, but not
limited to, military applications. It explains the application of statistics
and probability to problem-solving, and provides examples of methods useful
in solving recurring requests.

Since the US Air Force Environmental Technical Applications Center (USAFETAC)
is the organization designated as the facility of Air Weather Service for
providing climatic services to military agencies and other authorized
organizations, this technical report covers, for the most part, techniques
and methods employed by USAFETAC personnel. The examples selected for
inclusion allow the reader to follow the practical applications of
probability and statistics in solving some of the more frequent requests
received by the military climatologist.

This technical report is a reprint and redesignation of AWSP 105-2, 1 November
1968, including Change 1, dated 22 December 1969. Chapters 1 and 2 of that
document have been deleted as they are now outdated. No other changes, either
editorial or in content, have been made.

Chapter 3
PROBABILITY


1.  Introduction. 

a. The applied climatologist is concerned with the application of probability
theory in the interpretation of climatological data. The concept of
probability involves the notion of prediction of the number of times an
attribute, A, will occur in a month, in a year, or in any interval of time.

b. If an event can occur in N mutually exclusive and equally likely ways and
if f of these outcomes have an attribute A, then the probability (P) of A is
the fraction f/N:

(1)  P(A) = f/N

c. The concept of probability as a relative frequency, i.e., estimated or
empirical probability, is the most meaningful in applied climatology. As the
sample size increases, the observed relative frequency (probability)
approaches the true relative frequency in the universe.

Example 1: What is the probability that Washington, D. C. will have a
ceiling < 1000 feet and/or visibility < 3 miles in February? The answer
depends on the period of record used to determine the probability. Thus, if
we used only one year (1950), the probability would be:

P(A) = f/N = 0.180

For five years (1950-1954), P(A) = f/N = 0.091

For ten years (1950-1959), P(A) = f/N = 0.131

Because of the larger sample size, the probability determined from ten years
of data is most likely nearer the true probability than that given by the
data for one or five years.

d. Basic probability theorems are shown below. In addition to the usual
theorems, the relationship between the odds of an attribute occurring and the
probability of that attribute is given by Equation (3).

. The probability of a "certain" attribute is 1.

. The probability of an "impossible" attribute is 0.

. The probability of the attribute A is 0 ≤ P(A) ≤ 1.

. If A and a are complementary attributes, then

(2)  P(A) = 1 - P(a)

. The odds for the attribute A are m to n, if and only if

(3)  P(A) = m/(m + n)

Using Equation (3), the following odds versus probability of attribute A are
obtained:

Odds for A    P(A)
1 to 1        1/2
2 to 1        2/3
3 to 5        3/8
1 to 2        1/3
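
The odds table can be reproduced with a few lines of Python. This is a minimal sketch, assuming the usual relation P(A) = m/(m + n) for odds of m to n; the function names are illustrative and not part of the report.

```python
from fractions import Fraction


def odds_to_probability(m: int, n: int) -> Fraction:
    """Probability of A when the odds for A are m to n."""
    if m < 0 or n < 0 or m + n == 0:
        raise ValueError("odds must be non-negative and not both zero")
    return Fraction(m, m + n)


def probability_to_odds(p: Fraction) -> tuple[int, int]:
    """Odds (m, n) in lowest terms for a probability 0 <= p <= 1."""
    p = Fraction(p)
    if not 0 <= p <= 1:
        raise ValueError("probability must lie between 0 and 1")
    return p.numerator, p.denominator - p.numerator


# Reproduce the four rows of the odds-versus-probability table.
for m, n in [(1, 1), (2, 1), (3, 5), (1, 2)]:
    print(f"{m} to {n}: P(A) = {odds_to_probability(m, n)}")
```

Exact fractions are used so that the output matches the tabulated values without rounding.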

2. Statistics of Attributes (Set Theory).

a. Statistical analysis deals with quantitative data that arise in two
different ways. In the first case, the observer notes the occurrence or
nonoccurrence of some attribute in a series of observations. The methods
applicable to this type are referred to as "statistics of attributes" [55].
In the second case, the observer notes or measures the actual magnitude of
some variable. The methods applicable to these cases are referred to as
"statistics of variables."

b. Attributes are divided into two distinct classes. Letters A, B, and C will
be used to denote the several attributes. All members that possess attribute
A, B, or C will be termed Class A, Class B, or Class C. All members not
possessing attribute A, B, or C will be termed Class a, b, or c (lower case
letters mean not A, not B, and not C).

c. The number of observations assigned to a class is called the class
frequency. Class frequencies of attributes are designated by order number,
depending on the number of attribute classes included. For example, AB, aB,
bC are classes of the second order; ABc, aBc, AbC are classes of the third
order. For three attributes there are 3^3, or 27, distinct frequencies. For n
attributes there are 3^n distinct frequencies. A class frequency can be
expressed in terms of class frequencies of higher order, as

classes).


Class     Frequency
(ABC)        225
(aBC)        675
(ABc)        225
(aBc)        900
(AbC)        600
(abC)       1350
(Abc)        375
(abc)       3150

(A) = (ABC) + (ABc) + (AbC) + (Abc) = 1425

(B) = (ABC) + (ABc) + (aBC) + (aBc) = 2025

(AB) = (ABC) + (ABc) = 450

N = (ABC) + (ABc) + (AbC) + (Abc) + (aBC) + (aBc) + (abC) + (abc) = 7500

The complete results are:

N = 7500      (B) = 2025    (AB) = 450    (BC) = 900
(A) = 1425    (C) = 2850    (AC) = 825    (ABC) = 225
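
The summations above can be checked mechanically. The sketch below is illustrative only: the dictionary layout and the helper name class_frequency are assumptions, not part of the report. A class frequency is obtained by adding every third-order (ultimate) frequency whose label contains all the requested letters, with case distinguishing A from a.

```python
# Ultimate (third-order) class frequencies from the example above.
ultimate = {
    "ABC": 225, "aBC": 675, "ABc": 225, "aBc": 900,
    "AbC": 600, "abC": 1350, "Abc": 375, "abc": 3150,
}


def class_frequency(attrs: str, ultimate: dict[str, int]) -> int:
    """Sum the ultimate frequencies whose label contains every letter in attrs.

    Matching is case-sensitive, so "A" selects ABC, ABc, AbC, Abc and
    "a" selects the complementary four classes.
    """
    return sum(
        count for label, count in ultimate.items()
        if all(letter in label for letter in attrs)
    )


N = class_frequency("", ultimate)  # the empty selection matches every class
positive = {
    name: class_frequency(name, ultimate)
    for name in ("A", "B", "C", "AB", "AC", "BC", "ABC")
}
print(N, positive)
```

The same helper also returns negative and mixed classes, for example class_frequency("aB", ultimate) for (aB).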

e. A fundamental set is a set of 2^n class frequencies that are well defined
and distinct. The fundamental set specifies the whole data. The positive
class frequencies are a fundamental set, i.e., N, (A), (B), (C), (AB), (AC),
(BC), (ABC). The following example shows how, if given the positive class
frequencies of Example 2, all of the class frequencies can be determined.

Example 3: Given a fundamental set, e.g., the eight positive class
frequencies of Example 2, find all the class frequencies.

N = 7500; (A) = 1425; (B) = 2025; (C) = 2850;
(AB) = 450; (AC) = 825; (BC) = 900; (ABC) = 225

We have:

(AB) = (ABC) + (ABc)
450 = 225 + (ABc)
(ABc) = 225

Similarly, from (AC) and (BC):

(AbC) = 600
(aBC) = 675

Also

(abC) = (bC) - (AbC)
      = (C) - (BC) - (AbC)
      = 2850 - 900 - 600
      = 1350

Similarly, we have

(Abc) = 375
(aBc) = 900

Finally

(abc) = (bc) - (Abc)
      = (c) - (Bc) - (Abc)
      = N - (C) - ((B) - (BC)) - (Abc)
      = 7500 - 2850 - 1125 - 375
      = 3150

f. Class frequencies that have been or might have been observed in the same
population may be said to be consistent with one another. For this
consistency to exist, the necessary and sufficient condition is that no
ultimate class frequency be negative [55].

This condition must exist                             or this frequency
                                                      will be negative

Two Attributes
(AB) ≥ 0                                                   (AB)
(AB) ≥ (A) + (B) - N                                       (ab)
(AB) ≤ (A)                                                 (Ab)
(AB) ≤ (B)                                                 (aB)

Three Attributes
(ABC) ≥ 0                                                  (ABC)
(ABC) ≥ (AB) + (AC) - (A)                                  (Abc)
(ABC) ≥ (AB) + (BC) - (B)                                  (aBc)
(ABC) ≥ (AC) + (BC) - (C)                                  (abC)
(ABC) ≤ (AB)                                               (ABc)
(ABC) ≤ (AC)                                               (AbC)
(ABC) ≤ (BC)                                               (aBC)
(ABC) ≤ (AB) + (AC) + (BC) - (A) - (B) - (C) + N           (abc)

g. It is possible to draw inferences from data that are otherwise
insufficient for calculating all class frequencies by use of the limits set
forth in the preceding paragraph.

Example 4: Given: Airfield A, open 80% of the time;
                  Airfield B, open 95% of the time.

Find limits to the percentage of time A and B are simultaneously open, i.e.,
(AB).

(AB) ≥ 0                        (AB) ≤ (A) = 80
(AB) ≥ 80 + 95 - 100 = 75       (AB) ≤ (B) = 95

So, both airfields are open not less than 75% or more than 80% of the time.
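
The inversion of Example 3, the consistency condition, and the limits of Example 4 can be sketched together in Python. The function names below are hypothetical; the formulas are the inclusion-exclusion relations implied by the class-frequency definitions, not code from the report.

```python
def ultimate_from_positive(N, A, B, C, AB, AC, BC, ABC):
    """Derive the eight ultimate class frequencies from the positive ones."""
    return {
        "ABC": ABC,
        "ABc": AB - ABC,
        "AbC": AC - ABC,
        "aBC": BC - ABC,
        "Abc": A - AB - AC + ABC,
        "aBc": B - AB - BC + ABC,
        "abC": C - AC - BC + ABC,
        "abc": N - A - B - C + AB + AC + BC - ABC,
    }


def is_consistent(ultimate):
    """Class frequencies are consistent when no ultimate frequency is negative."""
    return all(count >= 0 for count in ultimate.values())


def limits_AB(N, A, B):
    """Lower and upper limits on (AB) given only N, (A), and (B)."""
    return max(0, A + B - N), min(A, B)


example_3 = ultimate_from_positive(7500, 1425, 2025, 2850, 450, 825, 900, 225)
print(example_3, is_consistent(example_3))
print(limits_AB(100, 80, 95))  # Example 4, frequencies in percent
```

Feeding in an impossible set, such as (AB) larger than (A), makes is_consistent return False because (Abc) or (ABc) goes negative.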

h. Attributes may be independent of each other, completely associated,
partially associated, or completely disassociated.

(1) Independence. If the occurrence of one attribute gives no information as
to the occurrence of the other attribute, they are said to be independent of
each other. Thus, with independent attributes A and B, we can expect the same
proportion of A's among the B's as among the not B's, i.e.,
(AB)/(B) = (Ab)/(b). This relationship can be seen in the 2 x 2 table below:

TABLE 1
Independent Attributes.

Attribute    B       b       Total
A            (AB)    (Ab)    (A)
a            (aB)    (ab)    (a)
Total        (B)     (b)     N

The proportion of A's among B's is the same as that in the universe, i.e.,

(4)  (AB)/(B) = (A)/N,  or  (AB) = (A)(B)/N


November  1968 


AWSP  105-2 


Equation (4) is the fundamental rule for independence.

(2) Attributes A and B are associated if they appear together in a greater
number of cases than is expected of independent attributes, e.g.,
(AB) > (A)(B)/N; if (AB) < (A)(B)/N, A and B are disassociated. For complete
association (AB) must be equal to (A) or (B), whichever is smaller. For
complete disassociation (AB) must be equal to zero or (A) + (B) - N,
whichever is greater.

(3) The associations between A and B in subuniverses, defined by C and c,
are called partial associations. A and B are positively associated in the
population C if (ABC) > (AC)(BC)/(C), and negatively associated in the
converse case.

(4) The coefficient of complete association (a) must not be dependent upon
the size of the sample or frequency of the attributes; and (b) must be
convenient to use, i.e., the coefficient is zero when independent, +1 when
completely associated, and -1 when completely disassociated. A simple
coefficient that satisfies these two requirements is

(5)  Q = [(AB)(ab) - (Ab)(aB)] / [(AB)(ab) + (Ab)(aB)]

An example of the use of Equation (5) is shown below:

Example 5: The frequency of halos and subsequent precipitation was observed.
The following 2 x 2 table summarizes the observations:

TABLE 2
Summary of Halos and Subsequent Precipitation.

               Precipitation Within 48 Hours
               Yes B     No b     Total
Halo A          497       149       646
No Halo a       819       819      1638
Total          1316       968      2284

Using Equation (5)

Q = [(497)(819) - (149)(819)] / [(497)(819) + (149)(819)] = 0.54
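
The coefficient of Equation (5) is simple to evaluate from the four cells of a 2 x 2 table. The following is a minimal sketch; the function name and argument order are assumptions made for illustration.

```python
def association_coefficient(AB, Ab, aB, ab):
    """Coefficient of association for a 2 x 2 table of class frequencies.

    Returns 0 for independent attributes, +1 for complete association,
    and -1 for complete disassociation.
    """
    numerator = AB * ab - Ab * aB
    denominator = AB * ab + Ab * aB
    if denominator == 0:
        raise ValueError("coefficient is undefined for this table")
    return numerator / denominator


# Halo / precipitation counts from Table 2.
halo_q = association_coefficient(AB=497, Ab=149, aB=819, ab=819)
print(round(halo_q, 2))
```

Because the coefficient depends only on the cross-products of the cells, multiplying every cell by the same factor leaves it unchanged, which is the sample-size requirement (a) stated above.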


(5) For partial associations, the coefficient of association is defined as:

(6)  Q_AB.C = [(ABC)(abC) - (AbC)(aBC)] / [(ABC)(abC) + (AbC)(aBC)]

(6) An example of the use of the coefficient of association equations is
shown below:

Example 6: Let us suppose that the following percentage frequency of
attributes has been observed.

          C      c     Total
AB       12     12       24
Ab       28      8       36
aB        3     18       21
ab        7     12       19
Total    50     50      100

The total association between A and B from Equation (5) is

Q_AB = [(24)(19) - (36)(21)] / [(24)(19) + (36)(21)] = -0.25

This indicates a negative association exists between A and B in the total
universe. Now we consider the partial association between A and B in the
subuniverses of C and c by Equation (6):

Q_AB.C = [(12)(7) - (28)(3)] / [(12)(7) + (28)(3)] = 0

Q_AB.c = [(12)(12) - (8)(18)] / [(12)(12) + (8)(18)] = 0

This indicates that A and B are independent of each other in both
subuniverses.
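
Example 6 can be verified numerically. The sketch below is illustrative (the nested-dictionary layout and the name q are assumptions); it applies the same cross-product coefficient first to the pooled table and then to each subuniverse.

```python
def q(AB, Ab, aB, ab):
    """Coefficient of association from the four cells of a 2 x 2 table."""
    return (AB * ab - Ab * aB) / (AB * ab + Ab * aB)


# Percentage frequencies from Example 6, split by subuniverse C and c.
table = {
    "C": {"AB": 12, "Ab": 28, "aB": 3, "ab": 7},
    "c": {"AB": 12, "Ab": 8, "aB": 18, "ab": 12},
}

# Pool the two subuniverses to obtain the total 2 x 2 table.
total = {cell: table["C"][cell] + table["c"][cell] for cell in ("AB", "Ab", "aB", "ab")}

q_total = q(**total)
q_in_C = q(**table["C"])
q_in_c = q(**table["c"])
print(round(q_total, 2), q_in_C, q_in_c)
```

The pooled table shows a negative association even though the attributes are independent within each subuniverse, which is why the partial coefficients are worth computing whenever a third attribute is suspected of influencing both A and B.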

i. Manifold classification [55] is the division of a population according to
an attribute A into a number of classes. The fundamental principles of
manifold classification are similar to, but more complicated than, those for
a 2 x 2 classification. Manifold classification according to A and B gives a
contingency table where the classification of the A's is s-fold and that of
the B's t-fold. There will be s x t classes, represented by the following
table:

TABLE 3
Manifold Contingency.

Attribute   B_1          B_2          ...   B_t          Totals
A_1         (A_1 B_1)    (A_1 B_2)    ...   (A_1 B_t)    (A_1)
A_2         (A_2 B_1)    (A_2 B_2)    ...   (A_2 B_t)    (A_2)
...
A_s         (A_s B_1)    (A_s B_2)    ...   (A_s B_t)    (A_s)
Totals      (B_1)        (B_2)        ...   (B_t)        N

j. Association in a contingency table can be examined by reducing it in a
number of ways to a 2 x 2 table, but the procedure is very long and tedious
if s and t are large. In practice, we usually want a coefficient that will
summarize the general nature of the dependence. Two such coefficients are
Pearson's "coefficient of mean-square contingency" and Tschuprow's
"coefficient of contingency." These coefficients are discussed below:

(1) If A's and B's are completely independent in the universe, then for all
values of i and j

(7)  (A_i B_j) = (A_i B_j)_0

where (A_i B_j)_0 = (A_i)(B_j)/N is the expected frequency. If A and B are
not independent, then the difference d_ij equals:

(8)  d_ij = (A_i B_j) - (A_i B_j)_0

We define the quantity \chi^2 as follows:

(9)  \chi^2 = \sum_i \sum_j d_ij^2 / (A_i B_j)_0

NOTE: \chi^2 must be calculated from the actual number of observations (not
percentage frequencies) in each class and the total number of observations N
must be known, and the expected frequency (A_i B_j)_0 should be at least five
for any class.

From the above, we have the "coefficient of mean-square contingency" as given
by Karl Pearson:

(10)  C = [\chi^2 / (N + \chi^2)]^{1/2}

(2) The coefficient shown in Equation (10) has one serious disadvantage,
i.e., it never reaches 1 as a limit. To remedy this, Tschuprow proposed the
coefficient T defined by:

(11)  T^2 = \chi^2 / (N [(s-1)(t-1)]^{1/2})

The coefficient of Equation (11) varies between 0 and 1 when s = t.

Example 7: Are the following field conditions independent of the given
synoptic hours?

TABLE 4
Summary of Field Conditions.

                 Field Conditions
Hours      Closed     Inst     Contact     Totals
06           55        59        218         332
12            4        33        298         335
18            7        19        307         333
Total        66       111        823        1000

Degrees of freedom (df)
= (no. of rows minus 1) (no. of columns minus 1)
= (3-1)(3-1) = 4

Using Equation (9)

\chi^2 = \sum_i \sum_j d_ij^2 / (A_i B_j)_0 = 114.4

Using Equation (10)

C = [\chi^2 / (N + \chi^2)]^{1/2} = 0.32

Using Equation (11)

T = {\chi^2 / (N [(s-1)(t-1)]^{1/2})}^{1/2} = 0.24,  where s = t = 3

Hence, field conditions are significantly dependent on synoptic hours (06,
12, 18) and not likely by chance.
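
The three quantities of Example 7 can be recomputed directly from Table 4. This is a sketch under the assumption that the expected frequency of each cell is (row total)(column total)/N; the function name is hypothetical.

```python
from math import sqrt


def contingency_coefficients(table):
    """Return (chi-square, Pearson's C, Tschuprow's T) for an s x t table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # frequency under independence
            chi2 += (observed - expected) ** 2 / expected

    s, t = len(table), len(table[0])
    pearson_c = sqrt(chi2 / (n + chi2))
    tschuprow_t = sqrt(chi2 / (n * sqrt((s - 1) * (t - 1))))
    return chi2, pearson_c, tschuprow_t


# Rows: hours 06, 12, 18; columns: Closed, Inst, Contact (Table 4).
field_conditions = [
    [55, 59, 218],
    [4, 33, 298],
    [7, 19, 307],
]
chi2, pearson_c, tschuprow_t = contingency_coefficients(field_conditions)
print(round(chi2, 1), round(pearson_c, 2), round(tschuprow_t, 2))
```

The function expects raw counts, in line with the NOTE that chi-square must not be computed from percentage frequencies.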


k. In previous subsections, we were concerned with the intersection or joint
occurrence of attributes A and B, i.e., the occurrence of both A and B. One
other concept is important: the union of attributes A and B, which is the
occurrence of either A or B, or both. The intersection of A and B is denoted
as AB; the union of A and B is denoted as A∪B. These two concepts can be seen
in Figure 2, where the intersection of A and B is represented by the shaded
area and the union of A and B is shown by the area enclosed by the heavy
line. The union of A and B = Ab + AB + aB = A + B - AB = 1 - ab.

Figure 2. The Intersection and Union of Attributes A and B (N = 1).

l. The probability that at least one of two attributes, i.e., A and/or B,
occurs is equal to the sum of the probabilities of each event minus the
probability that both events occur simultaneously,

(12)  P(A∪B) = P(A) + P(B) - P(AB)

Example 8: An attack on a coastal installation is being planned. Troops and
equipment can be delivered to the area by air, sea, or both. What is the
probability of success as a result of weather effects? Defining A as an event
favorable for delivery of troops by sea and B by air, climatological
information provides the following fundamental set:

P(A) = .50    P(B) = .60    P(AB) = .25

The probability of at least one mode favorable is

P(A∪B) = .50 + .60 - .25 = .85


m. The probability that at least one of two attributes occurs is also equal
to one minus the probability that neither attribute occurs, i.e.,

(13)  P(A∪B) = 1 - P(ab)

Example 9: What is the probability of < 1000 feet ceiling and/or < 2 miles
visibility at Harmon in January? From the "D" summary for Harmon, the
probability of ≥ 1000 feet ceiling and ≥ 2 miles visibility equals 0.83.
This gives

P(A∪B) = 1 - 0.83 = 0.17

n. If attributes A and B are mutually exclusive, i.e., P(AB) = 0, Equation
(12) becomes

(14)  P(A∪B) = P(A) + P(B)

Example 10: What is the probability that Base A and/or Base B are below GCA
minimums, given that the probability of Base A being below is 0.05, of Base B
being below minimum is 0.03, and of both Base A and Base B being below
simultaneously is zero? Since P(AB) = 0, Equation (14) is applicable.

P(A∪B) = P(A) + P(B) = 0.05 + 0.03 = 0.08
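
Examples 8 through 10 all reduce to two small calculations, sketched below in Python. The function names are illustrative assumptions; the range check on P(AB) applies the limits given earlier for two attributes.

```python
def p_union(p_a: float, p_b: float, p_ab: float = 0.0) -> float:
    """P(A or B or both) from P(A), P(B), and the joint probability P(AB).

    With p_ab = 0 (mutually exclusive attributes) this is the simple sum.
    """
    tolerance = 1e-12
    lower, upper = max(0.0, p_a + p_b - 1.0), min(p_a, p_b)
    if p_ab < lower - tolerance or p_ab > upper + tolerance:
        raise ValueError("P(AB) is inconsistent with P(A) and P(B)")
    return p_a + p_b - p_ab


def p_union_from_neither(p_neither: float) -> float:
    """P(A or B or both) as one minus the probability that neither occurs."""
    return 1.0 - p_neither


print(round(p_union(0.50, 0.60, 0.25), 2))    # Example 8
print(round(p_union_from_neither(0.83), 2))   # Example 9
print(round(p_union(0.05, 0.03), 2))          # Example 10
```

Which form is more convenient depends on the summary at hand: a simultaneous summary supplies P(AB), while a single-station summary of "ceiling and visibility at or above limits" supplies P(ab) directly.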


o. The definition of conditional probability can be stated as follows: The
probability of attribute A, given that attribute B has occurred, is equal to
the probability of both A and B divided by the probability of B.

(15)  P(A/B) = P(AB)/P(B)

Equation (15) is undefined if P(B) = 0.

Example 11: Going back to Example 8, what is the probability of success by
sea, given that air delivery is unfavorable?

P(A/b) = P(Ab)/P(b)

P(A) = P(AB) + P(Ab) = .50          P(b) = 1 - P(B)
P(Ab) = P(A) - P(AB)                P(b) = 1 - .60 = .40
P(Ab) = .50 - .25 = .25

P(A/b) = .25/.40 = .625

So, if it is known that air delivery is unfavorable, the probability of
success by sea is increased from .50 to .625.

Example 12: What is the probability that Base A will be above alternate
minimums (ceiling ≥ 1000 feet and visibility ≥ 3 miles), given that Base B is
below GCA minimums (ceiling < 200 feet and/or visibility < 1/2 mile)? Again
we use Equation (15):

P(A/B) = P(AB)/P(B)

From a special summary that gives simultaneous conditions for Bases A and B,
P(AB) = .05, and from the "D" summary for Base B, P(B) = .06. Then

P(A/B) = .05/.06 = .83

This is an important and often useful application of the conditional
probability equation.
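
The two conditional-probability examples can be reproduced as follows. This is a minimal sketch; the helper name conditional is an assumption, and the intermediate steps mirror the hand calculation of Example 11.

```python
def conditional(p_joint: float, p_given: float) -> float:
    """P(A/B) = P(AB)/P(B); undefined when the conditioning probability is zero."""
    if p_given <= 0:
        raise ValueError("conditional probability is undefined when P(B) = 0")
    return p_joint / p_given


# Example 11: success by sea (A) given that air delivery is unfavorable (b).
p_a, p_b, p_ab = 0.50, 0.60, 0.25
p_a_and_not_b = p_a - p_ab   # P(Ab) = P(A) - P(AB)
p_not_b = 1.0 - p_b          # P(b)  = 1 - P(B)
example_11 = conditional(p_a_and_not_b, p_not_b)

# Example 12: Base A above alternate minimums given Base B below GCA minimums.
example_12 = conditional(0.05, 0.06)

print(round(example_11, 3), round(example_12, 2))
```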

p. It should be noted that all general probability theorems are also valid
for conditional probabilities with respect to any given attribute. For the
probability of occurrence of either A or B, or both, given attribute C has
occurred, we have

(16)  P(A∪B/C) = P(A/C) + P(B/C) - P(AB/C)

or in a form more easily used

(17)  P(A∪B/C) = [P(AC) + P(BC) - P(ABC)] / P(C)

Example 13: What is the probability that either Base A or Base B or both
will be above alternate minimums (ceiling ≥ 1000 feet and visibility ≥ 3
miles), given that Base C is below GCA minimums (ceiling < 200 feet and/or
visibility < 1/2 mile)? From a special summary that gives simultaneous
conditions for Bases A, B, and C, we find

P(AC) = .04    P(BC) = .07    P(ABC) = .02

And from the "D" summary for Base C

P(C) = .10

then

P(A∪B/C) = (.04 + .07 - .02)/.10 = .90

q. The probability of the attribute A and the conditional probability of A,
given the attribute B has occurred, are generally unequal, but they can be
equal, i.e.,

(18)  P(A/B) = P(A)

This means that knowing attribute B has occurred does not change the
probability of the attribute A. Therefore, the attribute A is independent of
the attribute B. The attribute B must have positive probability. And if

P(A) > 0 and P(B) > 0

we can rewrite Equation (15)

(19)  P(AB) = P(A) P(B/A) = P(B) P(A/B)

and like Equation (18)

(20)  P(B/A) = P(B)

When Equations (18) and (20) hold, Equation (19) becomes

(21)  P(AB) = P(A) P(B) = P(B) P(A)

r. Attributes A and B are said to be independent if, and only if, Equation
(21) holds, i.e., the probability that both A and B occur is the product of
the probability that A occurs and the probability that B occurs. If the
probability of the attribute A can be assumed to be independent of attribute
B, then Equation (21) can be used to give joint occurrence of A and B.

Example 14: What is the probability of ceiling ≥ 200 feet and visibility
≥ 1/2 mile at Elmendorf and, simultaneously, a ceiling of ≥ 1000 feet and
visibility ≥ 2 miles at Eielson in January? If we assume the two attributes
A (Elmendorf) and B (Eielson) to be independent, the "D" summaries give
P(A) = .96 and P(B) = .80, so

P(AB) = (.96)(.80) = .77

A special simultaneous summary for Elmendorf and Eielson gave P(AB) = .76, so
the assumption of independence was a good one.

Similarly, computations using Equation (21) can be extended to several
attributes:

(22)  P(ABC) = P(A) P(B) P(C)

Even if Equation (22) holds for A, B, and C, as in Example 15 below, no
conclusion can be drawn about P(ABc), P(AbC), ... P(abc) without further
information.

Example 15: Given:

Base A < 1000 feet and/or 3 miles, P(A) = .19

Base B < 1000 feet and/or 3 miles, P(B) = .27

Base C < 1000 feet and/or 3 miles, P(C) = .38

What is the probability all three are simultaneously below 1000/3?

If we assume that attributes A, B, and C are independent, then

P(ABC) = (.19)(.27)(.38) = .02

A special simultaneous summary gave P(ABC) = .03, so the assumption of
independence was quite good.
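
The product rule used in Examples 14 and 15, together with the comparison against the observed simultaneous summary, can be sketched as follows. The function names are hypothetical and the "gap" is simply observed minus expected joint probability.

```python
from math import prod


def joint_if_independent(*probabilities: float) -> float:
    """Joint probability of several attributes under the independence assumption."""
    return prod(probabilities)


def independence_gap(observed_joint: float, *probabilities: float) -> float:
    """Observed joint probability minus the value expected under independence.

    A gap near zero supports the independence assumption; a positive gap
    suggests the attributes tend to occur together.
    """
    return observed_joint - joint_if_independent(*probabilities)


# Example 14: Elmendorf and Eielson.
example_14 = joint_if_independent(0.96, 0.80)
# Example 15: three bases below 1000 feet and/or 3 miles.
example_15 = joint_if_independent(0.19, 0.27, 0.38)

print(round(example_14, 2), round(independence_gap(0.76, 0.96, 0.80), 3))
print(round(example_15, 2), round(independence_gap(0.03, 0.19, 0.27, 0.38), 3))
```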


3. Permutations and Combinations.

a. The number of permutations of r attributes selected from n given distinct
attributes, where attention is paid to the order in which the attributes are
selected, is given by:

(23)  (n)_r = P(n,r) = n!/(n - r)!

Example 16: Let (A, B, C, D) be a set of four air bases. How many wind
factors will be necessary to cover all the routes between each of the four
bases? The wind factor from A to B is different than that from B to A, and so
on for the other bases. Consequently, as we select two of the air bases at a
time, attention must be paid to the order in which they are selected. This
gives:

P(4,2) = 4!/(4 - 2)! = 4!/2! = (4 x 3 x 2 x 1)/(2 x 1) = 4 x 3 = 12

So there are 12 wind factors to calculate. The 12 pairs or 12 permutations
are:

AB    AC    AD    BC    BD    CD
BA    CA    DA    CB    DB    DC

b. The required number of combinations of r attributes selected from n given
distinct attributes, where the order in which the attributes are selected is
not important, is given by:

(24)  \binom{n}{r} = n!/(r! (n - r)!)

Example 17: For the four air bases in Example 16 let P(A), P(B), P(C), P(D)
be the probabilities that airfields A, B, etc., are closed. What are all the
possible combinations of the air bases being closed? If we select zero, one,
two, three, and then four airfields from the set of four and pay no attention
to the order of selection, we form all the possible combinations as shown
below:

(a) Zero, \binom{4}{0} = 4!/(0! 4!) = 1, where 0! = 1
    None of the airfields are closed.
    P(abcd)

(b) One, \binom{4}{1} = 4!/(1! 3!) = 4
    P(A), P(B), P(C), P(D)

(c) Two, \binom{4}{2} = 4!/(2! 2!) = 6
    P(AB), P(AC), P(AD), P(BC), P(BD), P(CD)

(d) Three, \binom{4}{3} = 4!/(3! 1!) = 4
    P(ABC), P(ABD), P(ACD), P(BCD)

(e) Four, \binom{4}{4} = 4!/(4! 0!) = 1
    P(ABCD)

So we have 1 + 4 + 6 + 4 + 1 = 16 well-defined and distinct probabilities
that will specify the whole data of four attributes A, B, C, and D.
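
Both counting examples can be enumerated with the Python standard library. This is an illustrative sketch: the variable names are assumptions, and math.perm and math.comb require Python 3.8 or later.

```python
from itertools import combinations, permutations
from math import comb, perm

bases = "ABCD"

# Example 16: ordered pairs of bases, one wind factor per directed route.
routes = ["".join(pair) for pair in permutations(bases, 2)]

# Example 17: unordered selections of 0, 1, 2, 3, and 4 closed airfields.
closures = {
    r: ["".join(group) for group in combinations(bases, r)]
    for r in range(len(bases) + 1)
}
counts = [len(closures[r]) for r in range(len(bases) + 1)]

print(len(routes), perm(4, 2), routes)
print(counts, [comb(4, r) for r in range(5)], sum(counts))
```

The total of 16 equals 2^4, the size of a fundamental set for four attributes.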


4. Binomial Distribution.

a. The binomial distribution can be used to find probabilities and
percentiles when only the mean value is known. However, it can be used only
when two alternatives or outcomes are possible, i.e., the attribute occurs or
does not occur on an individual trial. Also, there are two important
assumptions associated with the binomial distribution method: (1) the trials
(N) are independent, and (2) the probability (p) remains the same throughout
the trials.

b. The binomial probability function for a given p and n, with k as the
number of successes, is given by

(25)  b(k; n,p) = \binom{n}{k} p^k q^{n-k},    k = 0, 1, ..., n

where p is the probability of occurrence and q = 1 - p is the probability of
nonoccurrence. If we denote the number of successes k in n trials by S_n,
then

(26)  b(k; n,p) = P(S_n = k) = \binom{n}{k} p^k q^{n-k}

Equation (26) represents the kth term of the binomial expansion

(27)  (q + p)^n = b(0; n,p) + b(1; n,p) + b(2; n,p) + ... + b(n; n,p) = 1

Example 18: Washington, D. C. can expect 6.4 days with thunderstorms in July,
i.e., p = 6.4/31 = 0.206. If we assume that the occurrence of a thunderstorm
on a given day is independent of whether a thunderstorm occurred the day
before, what is the probability of at least four days with thunderstorms
during the month of July? The probability is:

P(S_31 ≥ 4) = 1 - P(S_31 ≤ 3)

P(S_31 ≥ 4) = 1 - [P(S_31 = 0) + P(S_31 = 1) + P(S_31 = 2) + P(S_31 = 3)]

P(S_31 = 0) = \binom{31}{0} (.206)^0 (.794)^31
            = [31!/(0! (31-0)!)] (1) (.794)^31
            = antiln (31 ln .794) = 0.0008

P(S_31 = 1) = \binom{31}{1} (.206)^1 (.794)^30
            = [31!/(1! (31-1)!)] (.206) (.794)^30
            = 31 (.206) [antiln (30 ln .794)] = 0.0070

P(S_31 = 2) = \binom{31}{2} (.206)^2 (.794)^29
            = \binom{31}{2} (.206)^2 [antiln (29 ln .794)] = 0.0255

P(S_31 = 3) = \binom{31}{3} (.206)^3 (.794)^28
            = \binom{31}{3} (.206)^3 [antiln (28 ln .794)] = 0.0618

The probability of at least four days with thunderstorms is one minus the sum
of these four probabilities.

P(S_31 ≥ 4) = 1.0000 - (0.0008 + 0.0070 + 0.0255 + 0.0618)
            = 1.0000 - 0.0950 = 0.905

The probability of four or more days with thunderstorms, based on the 16
observed cases in the past 19 years, is 0.85. This difference, 0.905 - 0.850
= 0.055 or 5.5%, probably is due in part to the persistence factor that will
be discussed in the next paragraph.
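
The hand calculation of Example 18 can be repeated without logarithm tables. The sketch below uses illustrative function names and evaluates the binomial terms exactly in floating point; individual terms may differ from the hand-computed values in the third or fourth decimal place, and the tail probability comes out near 0.907 rather than 0.905.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """b(k; n, p): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binomial_at_least(k: int, n: int, p: float) -> float:
    """P(S_n >= k), computed as one minus the sum of the lower terms."""
    return 1.0 - sum(binomial_pmf(i, n, p) for i in range(k))


p_thunder = 0.206  # 6.4 thunderstorm days in a 31-day July, rounded as in the example
lower_terms = [binomial_pmf(k, 31, p_thunder) for k in range(4)]
p_at_least_4 = binomial_at_least(4, 31, p_thunder)

print([round(term, 4) for term in lower_terms])
print(round(p_at_least_4, 3))
```

Summing the few lower terms and subtracting from one is the same shortcut used in the example; it avoids adding the 28 upper terms.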

c. The binomial probability function may be difficult to calculate unless
binomial tables are available, but if N is larger than 25 and p is between
0.25 and 0.75, the binomial distribution may be approximated by the normal
distribution. If the normal approximation is used, probabilities are easily
determined from probability paper (Figure 3) or normal tables using the mean
and standard deviation of the binomial distribution. The mean of the binomial
distribution is:

(28)  M = Np

where N is the number of independent trials, and p is the probability of
occurrence of the attribute. The standard deviation of the binomial
distribution is:

(29)  \sigma = \sqrt{Npq}

where, as before, p is the probability of occurrence and q = 1 - p is the
probability of nonoccurrence of the attributes. Meteorological phenomena tend
to have a persistence effect, e.g., if a phenomenon (thunderstorm, rain, fog,
etc.) has occurred on a given day, it is more likely to occur the next day
than if it had not occurred on the given day. Consequently, the observed
standard deviation usually will be somewhat higher than that calculated from
Equation (29). This difference, the observed minus the binomial standard
deviation, was not large enough to affect significantly the probabilities of
the attributes considered in Example 18. However, Brooks and Carruthers [11]
give persistence factors for various phenomena, which when multiplied by
\sigma^2 may give a better estimate of the standard deviation, i.e.,

Selected  factors  from  the  referenced  text  are: 

Phenomenon 
Thunder 
Rain 
Fog 
Snow 

Example  19:  Colorado  Springs,  Colorado  can  expect  14.6  days  with 
thunderstorms during the month of August. What is the probability of
at least 18 days with thunderstorms? If we assume independence, the
probabilities can be found using Equations (26) and (27):

$$P(S_{31} \ge 18) = P(S_{31} = 18) + P(S_{31} = 19) + \dots + P(S_{31} = 31)$$

This would be very long and difficult to calculate, but since p =
14.6/31 = 0.470 and N = 31, the normal approximation can be used to
determine the probability. The mean was given as 14.6 days and the
standard deviation is calculated using Equation (29):

$$s = \sqrt{31 \times \frac{14.6}{31} \times \frac{16.4}{31}} = \sqrt{7.23} = 2.69$$

Figure 3. The Normal Approximation with M = 14.6 and P = 0.470

The observed standard deviation of the August thunderstorm days in
Colorado Springs for the past 15 years was 2.59. On probability
paper (Figure 3), the mean 14.6 is plotted at 50% and one standard
deviation above the mean, 14.6 + 2.69 = 17.29, at 15.9% on the upper
scale. The straight line through these points gives the normal
approximation. In addition, the observed days per month for the past
15 years were ranked (using the formula P = m/(1 + n), where
m = 1, 2, ..., 15 and n = 15) and plotted, and the agreement is quite
good. It must be remembered that the normal distribution is continuous
and the binomial distribution gives probabilities at discrete points
only; therefore, we consider the points of the binomial distribution as
midpoints of class intervals. This means that for P(≥ 18) we read the
value at the 17.5 point, i.e., we read the percent on the upper scale
above the point where the normal approximation line intersects the 17.5
days with thunderstorms line. So P(≥ 18), the estimated probability of
at least 18 days with thunderstorms, is 14%, which agrees well with the
observed probability of 13.3% over the past 15 years. Other estimated
and observed probabilities are:


Days per Month    ≥ 10    ≥ 12    ≥ 14    ≥ 16    ≥ 18    ≥ 20

Estimated (%)     97.2    88.0    64.0    38.0    14.0     3.4

Observed (%)     100.0    80.0    67.7    33.3    13.3     0.0
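
The graphical reading in Example 19 can be checked numerically. The sketch below is a minimal illustration, not part of the original procedure: it assumes the mean (14.6) and standard deviation (2.69) quoted in the text, evaluates the normal curve at the class boundary k - 0.5, and uses hypothetical helper names (`normal_cdf`, `prob_at_least`). Because the report's values were read from probability paper, computed values may differ from the table by a point or two.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_at_least(k: int, mean: float, sd: float) -> float:
    """P(count >= k), read at the class boundary k - 0.5 (continuity correction)."""
    return 1.0 - normal_cdf((k - 0.5 - mean) / sd)


MEAN_DAYS = 14.6  # mean August thunderstorm days, as given in the text
SD_DAYS = 2.69    # standard deviation as quoted in the text

# Estimated probabilities (percent) for the thresholds tabulated above.
estimates = {
    k: round(100.0 * prob_at_least(k, MEAN_DAYS, SD_DAYS), 1)
    for k in (10, 12, 14, 16, 18, 20)
}
print(estimates)
```

The same function gives "no more than k days" probabilities as `1 - prob_at_least(k + 1, mean, sd)`, which corresponds to reading the line at the k + 0.5 point.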

Example 20: Washington, D. C. can expect 12 days with 0.01 inch or
more precipitation during the month of March. What is the probability
of no more than 6, 8, 10, 12, 14, or 16 days with 0.01 inch or more
precipitation? The normal approximation can be used again because p =
12/31 = 0.387 and N = 31. The mean is 12 and the estimated standard
deviation is:

$$s = \sqrt{31 \times \frac{12}{31} \times \frac{19}{31}} = \sqrt{7.35} = 2.71$$

Figure 4. The Normal Approximation with M = 12.0 and P = 0.387

The mean and standard deviation are plotted on Figure 4 along with the
19 observed cases. The straight line gives the normal approximation.
In this case, we read the probability of no more than six days, i.e.,
P(≤ 6), at the 6.5 point, P(≤ 8) at the 8.5 point, etc., on the
lower scale. Estimated and observed probabilities are:


Days per Month    ≤ 6     ≤ 8    ≤ 10    ≤ 12    ≤ 14    ≤ 16

Estimated (%)     1.7     8.8    28.0    58.0    83.0    95.8

Observed (%)      0.0    10.0    25.0    42.5    80.0    95.0

d. The binomial distribution also can be approximated by the Poisson
distribution. When N is large, p is small, and the mean is constant and
equal to Np, the binomial distribution is given approximately by the
Poisson distribution. The probability of 0, 1, 2, ..., k unlikely
successes in N trials is approximated by:

(30)  $$P(\lambda, k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$

where λ is the mean and equal to Np. This formula is usually much easier to
use than Equation (26).


Example 21: During the month of February, Washington, D. C. can expect
one day with one inch or more snowfall (p = 1/28, N = 28, and Np =
λ = 1). What is the probability of zero, one, two, or three days in
February with one inch or more snowfall? We use Equation (30) and let
k take values of 0, 1, 2, and 3 for probabilities of:

                                                               Poisson   Observed

Zero Days:    $P(1,0) = \frac{1^0 e^{-1}}{0!} = e^{-1}$              0.368     0.316

One Day:      $P(1,1) = \frac{1^1 e^{-1}}{1!} = e^{-1}$              0.368     0.263

Two Days:     $P(1,2) = \frac{1^2 e^{-1}}{2!} = \frac{0.368}{2}$     0.184     0.318

Three Days:   $P(1,3) = \frac{1^3 e^{-1}}{3!} = \frac{0.368}{6}$     0.061     0.053

The probability of at least three days with one inch or more snowfall
is:

$$P(S_{28} \ge 3) = 1 - [P(S_{28} = 0) + P(S_{28} = 1) + P(S_{28} = 2)]$$

$$= 1 - (0.368 + 0.368 + 0.184) = 1 - 0.920 = 0.08$$

In the past 19 years, three or more days with one inch or more
snowfall has occurred twice, i.e., p = 2/19 = 0.105, which agrees well
with the Poisson approximated value of 0.08.
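
The Poisson values in Example 21 are easy to reproduce. The following is a minimal sketch of Equation (30), assuming λ = Np = 1 as in the example; the function name `poisson_pmf` is chosen here for illustration.

```python
import math


def poisson_pmf(lam: float, k: int) -> float:
    """Equation (30): probability of exactly k events when the mean is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


LAM = 1.0  # Np = 28 * (1/28), from Example 21

# Probabilities of zero, one, two, and three days with snowfall.
probs = [poisson_pmf(LAM, k) for k in range(4)]

# At least three days: complement of zero, one, or two days.
p_at_least_3 = 1.0 - sum(probs[:3])

for k, p in enumerate(probs):
    print(f"{k} days: {p:.3f}")
print(f"3 or more days: {p_at_least_3:.2f}")
```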
Chapter 4
STATISTICS


1. Introduction.

a. Statistics, as applied to meteorology, involves the analysis of past
weather data and allows the analyst to draw conclusions about the behavior of
similar weather data in the future. For short-range forecasting, the
meteorologist relies heavily on a review of conditions immediately preceding
the forecast period. On the other hand, the meteorological statistician
summarizes past data according to time and space without reference to
immediate conditions. Such summaries have their greatest utility in
long-range forecasting.

b. In meteorology, statistical methods are used to obtain, analyze, and
present numerical weather data. Methods range from the most elementary
descriptive devices to extremely complicated mathematical procedures. The
meteorological statistician analyzes a series of observations in such a way
as to get the best information from them, and applies the results of his
analyses to specific questions about climate.

2. Frequency Distributions.

a. The meteorological observation network generates an enormous quantity
of data each year. Before the analyst can begin to study these data, he must
organize them in a manner that readily points out their major features and
simplifies future computations. A method that is frequently used by the
analyst is to sort observed data into separate classes. The numerical width
of each class is called the class interval and is the difference between the
upper mathematical limit of the class and the lower mathematical limit of the
class. The midpoint of the class interval is defined as the class mark. The
class frequency is the number of observations falling in the class interval.
The class percentage frequency is defined as follows:

$$\text{Class Percentage Frequency} = \frac{\text{Class Frequency}}{\text{Total Number of Observations}} \times 100$$

It is desirable, but not mandatory, that all classes have the same class
interval. A rule of thumb for determining the number of classes into which
these data should be divided is to use $5 \log_{10} N$ classes, where N is the
total number of observations. Table 5 is a summary of mean monthly
temperatures for Washington National Airport; it will be used in examples
throughout this chapter. A tabular display of data sorted into classes is
called a frequency distribution; Table 6 is such a display of the data shown
in Table 5.

TABLE 5

Mean Monthly Temperature in January at
Washington National Airport.

Year    Temp (°F)      Rank    Temp (°F)
1943       36            1        29
1944       37            2        30
1945       31            3.5      31
1946       37            3.5      31
1947       42            5        33
1948       29            6        34
1949       43            8.5      35
1950       48            8.5      35
1951       39            8.5      35
1952       41            8.5      35

Year  Temp  (*?) 


Rank  Temp  ( *F ) 


TABLE 6

Frequency Distribution.

          Class    Lower Math.    Upper Math.    Class    Percent
Class     Mark     Limit          Limit          Freq.    Freq.
29-31      30        28.5           31.5           4       19.0
32-34      33        31.5           34.5           2        9.6
35-37      36        34.5           37.5           8       38.0
38-40      39        37.5           40.5           2        9.6
41-43      42        40.5           43.5           4       19.0
44-46      45        43.5           46.5           0        0.0
47-49      48        46.5           49.5           1        4.8
Σ          -          -              -            21      100.0

No. of Classes = 5 log10 21 = 5 × 1.322 ≈ 7
Observed Range = 48 - 29 = 19°F
Class Interval = 19 ÷ 7 = 2.7 ≈ 3°F
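
The construction of Table 6 can be sketched in a few lines. This is an illustrative sketch only: it takes the 21 ranked January values as they are listed in Table 16, and it assumes that both the number of classes and the class interval are rounded up to whole numbers, which matches the rounding shown above (6.6 to 7 and 2.7 to 3). Percentages computed this way are unadjusted, so they can differ from the table in the last digit.

```python
import math

# January mean temperatures (°F), from the ranked column of Table 16.
temps = [29, 30, 31, 31, 33, 34, 35, 35, 35, 35, 36,
         36, 37, 37, 38, 39, 41, 41, 42, 43, 48]

# Rule of thumb: 5 log10(N) classes (rounded up here, an assumption).
n_classes = math.ceil(5 * math.log10(len(temps)))

# Class interval: observed range divided by the number of classes, rounded up.
width = math.ceil((max(temps) - min(temps)) / n_classes)

table = []
for j in range(n_classes):
    low = min(temps) + j * width   # lowest whole degree in the class
    high = low + width - 1         # highest whole degree in the class
    freq = sum(low <= t <= high for t in temps)
    class_mark = (low + high) / 2
    percent = 100.0 * freq / len(temps)
    table.append((low, high, class_mark, freq, percent))

for low, high, mark, freq, percent in table:
    print(f"{low}-{high}  mark={mark:.0f}  freq={freq}  pct={percent:.1f}")
```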

b. Often the analyst is interested in the frequency (or percentage
frequency) of observations with numerical value less than or equal to a
specific value. The frequency distribution may be readily converted into a
cumulative frequency distribution by summing the frequencies from the lowest
value to the highest value. Table 7 is a cumulative frequency distribution of
the frequencies shown in Table 6. A sample reading from Table 7 shows that
28.6% of the time the mean temperature is less than or equal to 34.5°F.

TABLE 7

Cumulative Frequency Distribution.

          Upper Math.    Class    Cumul.    Percent    Cumul.
Class     Limit          Freq.    Freq.     Freq.      Percent Freq.
29-31       31.5           4        4        19.0        19.0
32-34       34.5           2        6         9.6        28.6
35-37       37.5           8       14        38.0        66.6
38-40       40.5           2       16         9.6        76.2
41-43       43.5           4       20        19.0        95.2
44-46       46.5           0       20         0.0        95.2
47-49       49.5           1       21         4.8       100.0

c. A graphical presentation of a frequency distribution is called a
histogram. An important feature of a histogram is that each observation is
represented by a unit of area. The area representing each class is directly
proportional to the class frequency. Figure 5 is a histogram of the frequency
distribution of Table 6.

Figure 5. Histogram. When class intervals are equal, the height of the bar
is equal to the class frequency.

d. A graphical presentation of the cumulative frequency distribution is
called an ogive. The ordinate of an ogive is equal to the frequency (or
percent frequency) of observations having values less than or equal to a
specified value. This means the ordinate is proportional to the area of a
histogram located to the left of the abscissa value. Figure 6 is an ogive of
the cumulative frequencies of Table 7.

Figure 6. Ogive. The ordinate is plotted at the upper mathematical limit of
each class interval.

e. An approach frequently used in applied climatology is to determine a
value of the variable that is exceeded with a frequency of Q percent. In
order to determine this value of the variable, we turn to the ogive or
cumulative frequency distribution. Let us define the Kth percentile as the
value of the variable that has K percent of the observations less than or
equal to the specified value. The magnitude of the variable exceeded with a
frequency of Q percent is then the (100 - Q)th percentile.

Example 22: What is the 90th percentile temperature of the data in
Table 5?

Method A: Using the ogive in Figure 6, the answer is found to
be 42.6°F (see dashed lines).

Method B: Referring to the cumulative frequency distribution of
Table 7, use linear interpolation in the appropriate class
interval (in this example the 41-43°F interval).

$$\frac{43.5 - 40.5}{43.5 - P_{90}} = \frac{95.2 - 76.2}{95.2 - 90} = \frac{19.0}{5.2} = 3.66$$

$$43.5 - P_{90} = \frac{43.5 - 40.5}{3.66} = 0.82$$

$$P_{90} = 43.5 - 0.82 = 42.7\,^{\circ}\mathrm{F}$$

The 90th percentile ($P_{90}$) temperature is 42.7°F. This result can be
restated to indicate that 10% of the observations will have a
temperature in excess of 42.7°F.
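
Method B generalizes to any percentile. The sketch below is one possible way to code the interpolation, assuming the upper mathematical limits and cumulative percent frequencies of Table 7 as inputs; the function name `percentile_from_cumulative` is hypothetical, and the lower limit of the first class is inferred from the spacing of the first two upper limits.

```python
def percentile_from_cumulative(upper_limits, cum_percent, k):
    """Linearly interpolate the K-th percentile inside the class that contains it."""
    # Lower mathematical limit of the first class, inferred from the class spacing.
    prev_limit = upper_limits[0] - (upper_limits[1] - upper_limits[0])
    prev_pct = 0.0
    for limit, pct in zip(upper_limits, cum_percent):
        if pct >= k:
            fraction = (k - prev_pct) / (pct - prev_pct)
            return prev_limit + (limit - prev_limit) * fraction
        prev_limit, prev_pct = limit, pct
    raise ValueError("k exceeds the cumulative range")


# Upper mathematical limits and cumulative percent frequencies from Table 7.
upper = [31.5, 34.5, 37.5, 40.5, 43.5, 46.5, 49.5]
cum = [19.0, 28.6, 66.6, 76.2, 95.2, 95.2, 100.0]

p90 = percentile_from_cumulative(upper, cum, 90.0)
print(f"90th percentile: {p90:.1f} °F")
```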

3. Measures of Central Value.

a. Central value refers to the location of the center of the distribution
of observations. There are three of these measures: the median, the mode,
and the mean.

b. The median is the middle value of the observations when they are
arranged in ranked order (see Table 5). A look at Table 5 shows that the
median is 36°F. The median is also the 50th percentile value. The ogive in
Figure 6 shows the 50th percentile (median) is 36.15°F. Note the difference
between the median as obtained from Table 5 and from the ogive. The
difference occurs because in the ranked-order table the temperature is
recorded in discrete values, whereas in the ogive we have treated temperature
as a continuous variable.

c. The mode of a frequency distribution is that value of the variable for
which the frequency is a maximum. A mode then is also the most probable value
of the variable. There may be one or more modes to a frequency distribution.
A look at the frequency distribution in Table 6, or the histogram of Figure 5,
shows the following modes (Table 8):

TABLE 8

Modes.

Temperature    Frequency    Type Mode
30°F               4        Secondary
36°F               8        Primary
42°F               4        Secondary
48°F               1        Tertiary

d. The third and most useful of all measures of central value is the
arithmetic mean or the mean. The value of the mean lies in its use in the
mathematical theory of statistics. Assume a discrete variate X may assume the
values $x_1, x_2, \dots, x_N$, where each value has the corresponding
probability $p_1, p_2, \dots, p_N$. We define the expected value of X as:

$$E(X) = p_1 x_1 + p_2 x_2 + p_3 x_3 + \dots + p_N x_N = \sum_{i=1}^{N} p_i x_i$$

The mean is defined as the expected value of X and is shown as E(X) or
$\bar{X}$. This is the probabilistic definition of the mean. Simple
manipulation can soon transform E(X) into the well-known layman's definition
of the mean. Given:

$$E(X) = \sum_{i=1}^{N} p_i x_i$$

From probability theory, we note:

$$p_i = \frac{f_i}{\sum_{i=1}^{N} f_i}$$

then

$$E(X) = \sum_{i=1}^{N} \frac{f_i x_i}{\sum_{i=1}^{N} f_i}$$

but

$$\sum_{i=1}^{N} f_i = N$$

the number of observations, so:

$$E(X) = \frac{1}{N} \sum_{i=1}^{N} f_i x_i$$

If we assume $f_i = 1$ for all individual $x_i$'s, we find

$$E(X) = \frac{1}{N} (x_1 + x_2 + \dots + x_N) = \frac{1}{N} \sum_{i=1}^{N} x_i$$

The last equation is the layman's definition of the mean or average. Let
Y = aX + b, then it may be shown that E(Y) = E(aX + b) = a E(X) + b. This
implies that if a constant is added (or subtracted) to all values of X, the
mean of the new variable is the constant plus (or minus) the mean of X.
Further, if all values of X are multiplied by a constant, the mean of the new
variable is equal to the product of the constant and the mean of X.
Therefore, there are three procedures available for computing the mean:

(31)  $$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} f_i x_i$$

(32)  $$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} x_i$$

(33)  $$\bar{X} = b + \frac{1}{N} \sum_{i=1}^{N} f_i (x_i - b)$$

NOTE: When the observations are grouped in classes, the $x_i$'s of Equations
(31) and (33) are the class marks of each class.

Example 23: Given the data in Table 9, compute the mean using
Equations (31), (32), and (33).

Using Equation (31):

$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} f_i x_i = \frac{1}{21} \times 766 = 36.5$$

Using Equation (32):

$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} x_i = \frac{1}{21} \times 766 = 36.5$$

Using Equation (31):

$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} f_i x_i = \frac{1}{21} \times 768 = 36.6\,^{\circ}\mathrm{F}$$

NOTE: The difference between 36.6°F obtained in this example and
36.5°F found in Example 23 is the result of all observations
not falling on class marks of the different classes.
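
The ungrouped and grouped means quoted above (36.5 and 36.6°F) can be reproduced directly. This sketch is illustrative: it uses the ranked January values from Table 16 and the class marks and frequencies from Table 6, and the working origin `b = 36` is an arbitrary value assumed here to show that shifting the origin leaves the mean unchanged.

```python
# January mean temperatures (°F), from the ranked column of Table 16.
temps = [29, 30, 31, 31, 33, 34, 35, 35, 35, 35, 36,
         36, 37, 37, 38, 39, 41, 41, 42, 43, 48]

# Class marks and class frequencies from Table 6.
class_marks = [30, 33, 36, 39, 42, 45, 48]
class_freqs = [4, 2, 8, 2, 4, 0, 1]

n = len(temps)

# Plain average of the individual observations.
mean_raw = sum(temps) / n

# Frequency-weighted average of the class marks (grouped data).
mean_grouped = sum(f * x for f, x in zip(class_freqs, class_marks)) / n

# Same grouped mean computed about a working origin b (assumed value).
b = 36
mean_shifted = b + sum(f * (x - b) for f, x in zip(class_freqs, class_marks)) / n

print(f"ungrouped: {mean_raw:.1f}  grouped: {mean_grouped:.1f}  shifted: {mean_shifted:.1f}")
```

Working about an origin near the center of the data keeps the intermediate sums small, which mattered for hand computation and still helps numerical accuracy.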


4. Measures of Dispersion.

a. The mean by itself does not provide a clear picture of the
distribution. An important feature of a distribution is the extent to which
the observed values spread out from the mean, i.e., the dispersion. There are
three measures of dispersion:

(1) Range, the difference between the largest and the smallest
observed values,

(2) Mean Deviation, the average of the absolute value of the
deviations of the observed values from the mean, and

(3) Standard Deviation, the measure of the dispersion of the observed
values about their arithmetic mean.

b. Returning to the concept of expectation, variance is defined as
$E[(X - \bar{X})^2]$ and denoted as "$s^2$," where "s" is the standard
deviation of the sample. Simple manipulation of the defining equation of the
variance yields more workable forms of the equation:

(34)  $$s^2 = \mathrm{Var}\,X = E[(X - \bar{X})^2] = \frac{1}{N} \sum_{i=1}^{N} f_i (x_i - \bar{X})^2$$

also

$$E[(X - \bar{X})^2] = E(X^2 - 2X\bar{X} + \bar{X}^2) = E(X^2) - 2\bar{X}\,E(X) + \bar{X}^2$$

$$= E(X^2) - 2\bar{X}^2 + \bar{X}^2 = E(X^2) - \bar{X}^2$$

but

$$E(X^2) = \frac{1}{N} \sum_{i=1}^{N} f_i x_i^2$$

and

$$\bar{X}^2 = \left[ \frac{1}{N} \sum_{i=1}^{N} f_i x_i \right]^2$$

so

$$s^2 = \frac{1}{N} \sum_{i=1}^{N} f_i x_i^2 - \left[ \frac{1}{N} \sum_{i=1}^{N} f_i x_i \right]^2$$

(35)  $$s^2 = \frac{1}{N} \left\{ \sum_{i=1}^{N} f_i x_i^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} f_i x_i \right]^2 \right\}$$

If $f_i = 1$ for all individual $x_i$'s, then

(36)  $$s^2 = \frac{1}{N} \left\{ \sum_{i=1}^{N} x_i^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} x_i \right]^2 \right\}$$

Letting Y = aX + b, we have $\bar{Y} = a\bar{X} + b$ and can determine the
variance of Y, Var(Y):

$$\mathrm{Var}(Y) = E[(Y - \bar{Y})^2] = E[(aX + b - a\bar{X} - b)^2] = E[a^2 (X - \bar{X})^2]$$

$$= a^2 E[(X - \bar{X})^2] = a^2\,\mathrm{Var}(X)$$

Thus, adding (or subtracting) a constant to X does not change the variance of
X. However, multiplying X by a constant multiplies the variance of X by the
constant squared. For example, consider Equation (35):

$$s^2 = \frac{1}{N} \left\{ \sum_{i=1}^{N} f_i x_i^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} f_i x_i \right]^2 \right\}$$

(37)  $$= \frac{1}{N} \left\{ \sum_{i=1}^{N} f_i (x_i - A)^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} f_i (x_i - A) \right]^2 \right\}$$

If Equations (34), (35), (36), and (37) are multiplied by N/(N - 1), the
resulting equations give unbiased estimates of the population variance. For
computational purposes, any of the following equations can be used to find
the variance and standard deviation of our sample:

(38)  $$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{X})^2$$

(39)  $$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} f_i (x_i - \bar{X})^2$$

(40)  $$s^2 = \frac{1}{N-1} \left\{ \sum_{i=1}^{N} x_i^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} x_i \right]^2 \right\}$$

(41)  $$s^2 = \frac{1}{N-1} \left\{ \sum_{i=1}^{N} f_i x_i^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} f_i x_i \right]^2 \right\}$$

(42)  $$s^2 = \frac{1}{N-1} \left\{ \sum_{i=1}^{N} (x_i - A)^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} (x_i - A) \right]^2 \right\}$$

(43)  $$s^2 = \frac{1}{N-1} \left\{ \sum_{i=1}^{N} f_i (x_i - A)^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} f_i (x_i - A) \right]^2 \right\}$$


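Example 25: Given the data in Table 11 and the results of Example 24,
compute the variance and the standard deviation using Equations (38),
(41), and (43).

NOTE: Example 23 gives $\bar{X} = 36.5$ and $\sum_{i=1}^{N} x_i = 766$

Using Equation (38):

$$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{X})^2 = \frac{1}{20} [441.25] = 22.1\ (^{\circ}\mathrm{F})^2$$

Using Equation (41):

$$s^2 = \frac{1}{N-1} \left\{ \sum_{i=1}^{N} f_i x_i^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} f_i x_i \right]^2 \right\} = \frac{1}{20} \left[ 28{,}382 - \frac{1}{21} (766)^2 \right]$$

$$= \frac{1}{20} \left[ 28{,}382 - \frac{586{,}756}{21} \right] = \frac{1}{20} [441.24] = 22.1\ (^{\circ}\mathrm{F})^2$$

Using Equation (43):

$$s^2 = \frac{1}{N-1} \left\{ \sum_{i=1}^{N} f_i (x_i - A)^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} f_i (x_i - A) \right]^2 \right\}$$

$$= \frac{1}{20} \left[ 487 - \frac{1}{21} (31)^2 \right] = \frac{1}{20} \left[ 487 - \frac{961}{21} \right] = \frac{1}{20} [441.24] = 22.1\ (^{\circ}\mathrm{F})^2$$

The standard deviation is

$$s = +\sqrt{22.1} = 4.7\,^{\circ}\mathrm{F}$$

Note that $\sum_{i=1}^{N} (x_i - \bar{X})$, the sum of the deviations about
the mean, should be zero. This can be proved as follows:

$$\sum_{i=1}^{N} (x_i - \bar{X}) = \sum_{i=1}^{N} x_i - N\bar{X} \quad \text{but} \quad N\bar{X} = \sum_{i=1}^{N} x_i$$

therefore

$$\sum_{i=1}^{N} (x_i - \bar{X}) = \sum_{i=1}^{N} x_i - \sum_{i=1}^{N} x_i = 0$$

However, the sum in column 5 of Table 11 does not equal zero because
the mean of X is rounded off to 1/10°F.

Example 26: Compute the variance and standard deviation from grouped
data using the frequency distribution of Table 12 and Equations (41)
and (43).

Using Equation (41):

$$s^2 = \frac{1}{N-1} \left\{ \sum_{i=1}^{N} f_i x_i^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} f_i x_i \right]^2 \right\} = \frac{1}{20} \left[ 28{,}548 - \frac{1}{21} (768)^2 \right]$$

$$= \frac{1}{20} \left[ 28{,}548 - \frac{1}{21} (589{,}824) \right] = \frac{1}{20} [461] = 23.05 \approx 23.1\ (^{\circ}\mathrm{F})^2$$

Using Equation (43):

$$s^2 = \frac{1}{N-1} \left\{ \sum_{i=1}^{N} f_i (x_i - A)^2 - \frac{1}{N} \left[ \sum_{i=1}^{N} f_i (x_i - A) \right]^2 \right\}$$

$$= \frac{1}{20} \left[ 585 - \frac{1}{21} (-51)^2 \right] = \frac{1}{20} \left[ 585 - \frac{1}{21} (2601) \right]$$

$$= \frac{1}{20} [461] = 23.05 \approx 23.1\ (^{\circ}\mathrm{F})^2$$

Therefore

$$s = +\sqrt{23.1} = 4.8\,^{\circ}\mathrm{F}$$

c. Differences in computed values of the variance and the standard
deviation, as given in Examples 25 and 26, are due to the fact that, in
Example 26, grouped data were used and all observations do not lie exactly on
the class mark of each class interval. The use of grouped data always
produces an overestimate of the standard deviation. This may be compensated
for by Sheppard's correction:

(44)  $$s_c = +\sqrt{s^2 - \frac{i^2}{12}}$$

where $s^2$ is the variance and i is the class interval.

Example 27: The standard deviation given in Example 26 is corrected
as follows:

$$s_c = +\sqrt{23.1 - \frac{3^2}{12}} = \sqrt{23.1 - 0.75} = \sqrt{22.35} = 4.7\,^{\circ}\mathrm{F}$$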
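
The arithmetic of Examples 25 through 27 can be verified with a short script. The sketch below is illustrative only: it uses the sum-of-squares shortcut with the N - 1 divisor, takes the ranked values from Table 16 and the class marks and frequencies from Table 6, and applies Sheppard's correction with a class interval of 3.

```python
import math

# January mean temperatures (°F), from the ranked column of Table 16.
temps = [29, 30, 31, 31, 33, 34, 35, 35, 35, 35, 36,
         36, 37, 37, 38, 39, 41, 41, 42, 43, 48]

# Class marks and class frequencies from Table 6.
class_marks = [30, 33, 36, 39, 42, 45, 48]
class_freqs = [4, 2, 8, 2, 4, 0, 1]

n = len(temps)

# Unbiased variance from the individual observations.
var_raw = (sum(x * x for x in temps) - sum(temps) ** 2 / n) / (n - 1)

# Same shortcut applied to class marks weighted by class frequency.
sum_fx = sum(f * x for f, x in zip(class_freqs, class_marks))
sum_fx2 = sum(f * x * x for f, x in zip(class_freqs, class_marks))
var_grouped = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

# Sheppard's correction for grouping, class interval i = 3.
interval = 3
sd_corrected = math.sqrt(var_grouped - interval ** 2 / 12)

print(f"ungrouped variance: {var_raw:.1f}, sd: {math.sqrt(var_raw):.1f}")
print(f"grouped variance:   {var_grouped:.1f}, sd: {math.sqrt(var_grouped):.1f}")
print(f"corrected sd:       {sd_corrected:.1f}")
```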

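d. A standardized variable, often referred to as a normalized variable,
can be defined as a variable with a mean of zero and a standard deviation of
one. Such a variable is:

(45)  $$Z = \frac{X - \bar{X}}{s}$$

where $\bar{X}$ is the mean and s is the standard deviation of X.

To show that E(Z) = 0 and Var(Z) = 1:

$$\bar{Z} = E(Z) = E\left[ \frac{X - \bar{X}}{s} \right] = \frac{1}{s} E[X - \bar{X}] = \frac{1}{s} (0) = 0$$

$$\mathrm{Var}(Z) = E[(Z - \bar{Z})^2] = E[(Z - 0)^2] = E\left[ \left( \frac{X - \bar{X}}{s} \right)^2 \right]$$

$$= \frac{1}{s^2} E[(X - \bar{X})^2] = \frac{s^2}{s^2} = 1$$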
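
A quick numerical check of the standardized variable is sketched below, using the ranked January values from Table 16. It assumes the variance is taken with divisor N, as in the expectation form $E[(X - \bar{X})^2]$; with the N - 1 divisor the variance of Z would be (N - 1)/N rather than exactly one.

```python
import math

# January mean temperatures (°F), from the ranked column of Table 16.
temps = [29, 30, 31, 31, 33, 34, 35, 35, 35, 35, 36,
         36, 37, 37, 38, 39, 41, 41, 42, 43, 48]

n = len(temps)
mean = sum(temps) / n

# Standard deviation with divisor N, matching E[(X - mean)^2].
sd = math.sqrt(sum((x - mean) ** 2 for x in temps) / n)

# Standardize each observation.
z = [(x - mean) / sd for x in temps]

z_mean = sum(z) / n
z_var = sum((v - z_mean) ** 2 for v in z) / n
print(f"mean of z: {z_mean:.6f}, variance of z: {z_var:.6f}")
```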

5. Universe and Sample.

a. The universe is defined as all measurements of a meteorological
parameter that have or could have been taken and can or will be taken at a
particular geographical location (or locations). The sample is defined as all
measurements of the meteorological parameter that have been taken and are
available to the analyst for study. The sample is, therefore, a restricted
subset of the universe.

b. There are three statistical distributions connected with the concept
of universe and sample: the distribution of the universe, the distribution of
the sample, and the distribution of the sample statistics of all possible
samples of Sample Size N. Also, most statistical procedures are based on two
assumptions: the universe (or population) is infinite, and the individual
measurements in the sample are chosen in a random, independent manner. While
the universe of all meteorological parameters is infinite, the sample size of
many parameters may be quite small. The accepted practice in most
climatological studies is to use a time series of observations, i.e., all
consecutive values available are used. This practice, in many cases, violates
the second assumption noted above, for such observations may be
autocorrelated.

c. Autocorrelation coefficients are ordinary correlation coefficients
within a time series in which the correlated values are a constant interval
apart. The autocorrelation for any lag t can be estimated from the sample by:


An autocorrelation different from zero means that observations are not
independent and that some statistical tests in common use are not valid for
time-series data. However, an autocorrelation of zero does not necessarily
imply that the observations are independent. It is important that the analyst
be aware of the problems of using an autocorrelated time series. Special
methods have been proposed to take into consideration the fact that a time
series may not be an independent random sample.
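
As an illustration of estimating a lag autocorrelation from a sample, the sketch below uses one common textbook estimator (lagged products of deviations divided by the total sum of squared deviations). This particular form is an assumption made here and is not necessarily the estimator given in the report; the alternating series is hypothetical data chosen only to make the behavior obvious.

```python
def lag_autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (a common textbook estimator)."""
    n = len(series)
    mean = sum(series) / n
    # Total sum of squared deviations about the sample mean.
    denom = sum((x - mean) ** 2 for x in series)
    # Sum of products of deviations a constant interval (lag) apart.
    numer = sum((series[i] - mean) * (series[i + lag] - mean)
                for i in range(n - lag))
    return numer / denom


# Hypothetical illustration only: a strictly alternating series.
alternating = [1.0, -1.0] * 10

r1 = lag_autocorrelation(alternating, 1)  # neighbors always differ in sign
r2 = lag_autocorrelation(alternating, 2)  # values two steps apart are equal
print(f"lag 1: {r1:.2f}, lag 2: {r2:.2f}")
```

A strongly negative lag-1 value and a strongly positive lag-2 value show that successive observations in such a series are far from independent.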

d. In general, the mean of a sample that consists of independent random
observations is a good estimate of the mean of the universe. The variance of
this same sample (when computed using formulas given in paragraph 4 of this
chapter) is a good estimate of the variance of the universe. Paragraph 7 of
this chapter provides a method of determining the confidence limits of the
sample mean and variance for either independent or autocorrelated data.

e. The design of an experiment, or the method of choosing a sample, is of
importance in all meteorological problems. Paragraph 5a of this chapter
contains the definition of the sample. However, in some experiments, although
the finite population is known, one must fabricate the random sample. A table
of random numbers is useful for this purpose. It consists of numbers selected
in a manner similar to drawing numbered slips of paper from a hat. The table
is sufficiently large so that all numbers from zero through nine appear with
about the same frequency. By combining numbers, the analyst can obtain pairs,
three numbers at a time, and so forth. In using such a table, the analyst
should be sure to enter the table in a random manner. One method is to close
your eyes and place your finger on a page of the table. The digits under your
finger can be used, or these digits may be used to locate others. For example,
placing your finger over 2913 can be interpreted as using the digits in the
29th row of column 13.


(1) In a recent problem involving dry periods at an air base, it was
necessary to resort to a table of random numbers to fabricate a random sample.
A dry period for this problem was defined as a period during which no
precipitation greater than 0.10 inch occurred on any one day. Unfortunately,
consecutive daily rainfall amounts were not available, so the lengths of dry
periods could not be determined. However, since the average number of days
per month with precipitation greater than 0.10 inch was available, a table of
random numbers was used to select wet and dry days of each month.

(2) Using January as an example, we knew there were 31 items in the
universe and, from the record, we knew the average January had eight wet days.
Consequently, it was necessary to use two-digit columns from the random number
table. If the number selected from the table was 31 or less, we called the
corresponding day a wet day until eight wet days were selected. If the number
was greater than 31 or duplicated a selected number, it was skipped.
Following this procedure, these days were selected as wet days in January: 1,
5, 8, 9, 10, 17, 20, and 25. Using the same method, wet days for each month
of the average year were selected. Knowing the wet days, it was an easy
matter to determine the dry days and runs of consecutive dry days in an
average year.

(3) Did use of random numbers give a valid sample? Referring to the
actual data, the probability of a wet day throughout the year was found to be
0.24 and the probability of a dry day to be 0.76. Assuming no persistence,
the following data for an average year were computed (see Table 13). Notice
that four of the comparisons agree exactly, the numbers generally become less
as N increases for both actual and computed values, and, in most cases,
numerical values are comparable. These similarities indicate the fit of the
run of dry days determined by use of the random number table is fairly good
and should be tested mathematically. A χ² (Chi-square) test indicates that
a worse fit would occur by chance as often as one time in two (see paragraph
6g). Consequently, wet days selected by use of the table of random numbers
represent a valid sample.

TABLE 13

Check of Results Derived by Random Numbers Techniques.

                    Actual                        By Random Numbers
 N       Runs of N or      Runs of Exactly       Runs of Exactly
Days     More Dry Days     N Dry Days            N Dry Days
  1          67                16                    12
  2          51                13                    14
  3          38                 9                     9
  4          29                 7                     7
  5          22                 5                     3
  6          17                 4                     4
  7          13                 3.2                   4
  8           9.8               2.4                   4
  9           7.4               1.8                   1
 10           5.6               1.3                   0
 11           4.3               1.0                   1
 12           3.3               0.8                   1
 13           2.5               0.6                   2
 14           1.9               0.5                   0
 15           1.4               0.3                   0
 16           1.1               0.3                   0
 17           0.8               0.2                   1
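
The "Actual" columns of Table 13 can be approximated from the two probabilities alone. The sketch below is an inference rather than the report's stated method: it assumes that, with no persistence, a dry run begins on a dry day that follows a wet day, so the expected yearly number of runs of N or more dry days is 365 × 0.24 × 0.76^N. The function names are chosen here for illustration.

```python
P_WET = 0.24  # probability of a wet day, from the text
P_DRY = 0.76  # probability of a dry day, from the text
DAYS = 365


def runs_at_least(n: int) -> float:
    """Expected yearly number of dry runs lasting n or more days (assumed model)."""
    # A run starts with a wet day followed by a dry day, then n - 1 more dry days.
    return DAYS * P_WET * P_DRY ** n


def runs_exactly(n: int) -> float:
    """Expected yearly number of dry runs lasting exactly n days."""
    return runs_at_least(n) - runs_at_least(n + 1)


for n in range(1, 8):
    print(f"N={n}: at least {runs_at_least(n):5.1f}, exactly {runs_exactly(n):4.1f}")
```

Under this assumption the first rows come out near 67, 51, and 38 runs of N or more days, in line with the table; small differences in later rows would be expected if the tabulated values were differenced after rounding.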

6. The Normal Distribution.

a. The normal distribution is probably the most important frequency
distribution in meteorological statistics. The normal curve is symmetrical,
bell-shaped, and extends infinitely in both positive and negative directions.
The normal curve that best fits a particular sample of data is defined as that
curve with the same area, mean, and standard deviation as the sample. The
equation of the normal curve that best fits the sample is:

(47)  $$Y = \frac{N i}{s \sqrt{2\pi}}\, e^{-\frac{(x - \bar{x})^2}{2 s^2}}$$

where the ordinate Y equals the height of the curve for a given x, $\bar{x}$
is the mean, and s the standard deviation of the sample, N is equal to the
total number of observations, i is the class interval, and the area under the
normal curve is equal to the area of the histogram of the sample.

b. In order to make calculations easier, Equation (47) can be normalized
by letting $z = (x - \bar{x})/s$; then z is normally distributed with mean
equal to zero and standard deviation equal to one. The normalized equation is:

(48)  $$Y = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$$

where the area under the curve is equal to one. The curve defined by
Equation (48) is called the normal density function and values of Y are given
in Table 14.

TABLE 14

Ordinates of the Normal Curve.

 z       0.       1.       2.       3.
.0     .3989    .2420    .0540    .0044
.1     .3970    .2179    .0440    .0033
.2     .3910    .1942    .0355    .0024
.3     .3814    .1714    .0283    .0017
.4     .3683    .1497    .0224    .0012
.5     .3521    .1295    .0175    .0009
.6     .3332    .1109    .0136    .0006
.7     .3123    .0940    .0104    .0004
.8     .2897    .0790    .0079    .0003
.9     .2661    .0656    .0060    .0002

If the ordinate values as given by Equation (47) for the values of x at the
center of each interval (the class mark) are plotted, the smooth curve drawn
through these points is the theoretical normal curve of best fit to the data
sample.

Example 28: Fit a normal curve to the histogram of data given in
Table 5. N = 21, i = 3, $\bar{x}$ = 36.5, and s = 4.7.

TABLE 15

Computation of Ordinate Values.

         Class                          Y                               Observed
Class    Mark    x - x̄     |z|      (Table 14)    (N i / s) Y          Frequency
20-22     21     -15.5    3.2979      .0017       0.0228 or 0.0            0
23-25     24     -12.5    2.6596      .0118       0.1581 or 0.2            0
26-28     27      -9.5    2.0213      .0518       0.6941 or 0.7            0
29-31     30      -6.5    1.3830      .1532       2.0529 or 2.1            4
32-34     33      -3.5    0.7446      .3022       4.0495 or 4.0            2
35-37     36      -0.5    0.1064      .3968       5.3171 or 5.3            8
38-40     39       2.5    0.5319      .3463       4.6407 or 4.6            2
41-43     42       5.5    1.1702      .2012       2.6961 or 2.7            4
44-46     45       8.5    1.8085      .0788       1.0560 or 1.1            0
47-49     48      11.5    2.4469      .0202       0.2707 or 0.3            1
50-52     51      14.5    3.0852      .0035       0.0469 or 0.0            0
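
The fitted frequencies of Table 15 can be computed without table look-ups. The sketch below assumes the normal curve has the usual form Y = N i / (s √(2π)) · exp(-(x - x̄)² / (2 s²)), which reproduces the tabulated values closely, and uses the parameters stated in Example 28. The function name is chosen here for illustration.

```python
import math

# Parameters stated in Example 28.
N, INTERVAL, MEAN, SD = 21, 3, 36.5, 4.7


def fitted_frequency(class_mark: float) -> float:
    """Height of the fitted normal curve at a class mark (expected class frequency)."""
    z = (class_mark - MEAN) / SD
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return N * INTERVAL / SD * density


# Class marks 21, 24, ..., 51, as in Table 15.
class_marks = range(21, 52, 3)
fitted = {x: fitted_frequency(x) for x in class_marks}

for x, y in fitted.items():
    print(f"class mark {x}: fitted frequency {y:.1f}")
```

Because the curve is scaled by N i, the fitted frequencies sum to approximately the number of observations, which gives a simple check on the computation.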

c.  Figure  7 shows  the  theoretical  normal  curve,  with  the  same  mean  and 
standard  deviation,  fitted  to  the  data  given  In  Table  15.  The  fit  does  not  ap- 
pear to  be  good,  but  a histogram  can  be  misleading  because  the  frequencies  de- 
pend on  the  selection  of  the  classes  and  the  class  Interval. 


Figure  7.  Normal  Curve  of  Mean  Temperatures, 
January  19^3-1963»  at  Washington 
National  Airport. 

< 
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d.  The  cumulative  normal  distribution  function  gives  the  relative  fre- 
quency (probability)  that  observations  fall  below  (to  the  left  of)  any  specl- 
' fled  value.  The  cumulative  normal  distribution  function,  commonly  called  the  i 

normal  distribution.  Is  defined  by:  j 

1 i 

(49)  p = / e dz  ; 


where,  as  before,  z = ^ ~ It  must  be  remembered  that  Equation  (49)  Is  I 

S I 

normalized,  and  consequently,  the  area  from  minus  Infinity  to  plus  Infinity  Is 
equal  to  one , This  means  that  the  area  between  minus  Infinity  and  any  given  z 
Is  equivalent  to  the  relative  frequency  (probability)  that  the  observation 
falls  below  the  z value.  Tables  of  normal  distribution  function  are  available 
In  most  statistical  textbooks;  however.  It  Is  convenient  to  use  probability  1 

paper  to  determine  the  desired  areas.  i 

I 

e. The cumulative normal distribution appears as a straight line on normal
probability paper. The sample distribution can be plotted on probability
paper, and the degree to which the points form a straight line determines the
fit of the sample to a normal distribution. The normalized Equation (49) is
shown as a straight line on Figure 8, where areas below (to the left of)
specified z values are given on the bottom scale. For example, the area
below z = -2 is 0.023, below z = 0 is 0.50, below z = 2 is 0.977, and so on
for any z value.
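The areas quoted for z = -2, 0, and 2 can also be checked numerically instead
of being read from probability paper. The following is a minimal sketch,
assuming the standard identity between the normal integral of Equation (49)
and the error function; the helper name normal_cdf is hypothetical and is not
part of this pamphlet.

```python
import math


def normal_cdf(z: float) -> float:
    """Area under the standard normal curve from minus infinity to z."""
    # Equation (49) expressed through the error function erf.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for z in (-2.0, 0.0, 2.0):
    print(f"z = {z:+.1f}   area below = {normal_cdf(z):.3f}")
```

The same helper can replace a table lookup wherever an area below a given z
value is needed.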

Example 29: Can the temperature data in Table 5 be approximated by a
normal distribution? Temperatures are ranked in increasing order of
magnitude and cumulative probabilities are given by P = m/(1+N), where
m takes on values from 1 to N, as seen in Table 16. These temperatures
were plotted on probability paper (see Figure 9) and the fit to a
straight line is quite good, indicating that the distribution of the
data given in Table 5 is probably normal. The straight line shown on
Figure 9 is for the cumulative normal distribution with the same mean
and standard deviation as our sample.


Figure 9. January Temperatures (°F), 1943-1963, at Washington National Airport.
TABLE 16

Cumulative Probabilities.

Rank    Temp (°F)    m/(1+N)        Rank    Temp (°F)    m/(1+N)

 1         29         .045          11.5       36         .545
 2         30         .091          13.5       37         .591
 3.5       31         .136          13.5       37         .636
 3.5       31         .182          15         38         .682
 5         33         .227          16         39         .727
 6         34         .273          17.5       41         .773
 8.5       35         .318          17.5       41         .818
 8.5       35         .364          19         42         .864
 8.5       35         .409          20         43         .909
 8.5       35         .455          21         48         .955
11.5       36         .500
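The cumulative probabilities of Table 16 follow directly from P = m/(1+N).
The sketch below reproduces that column from the ranked temperatures; the
function name plotting_positions is an illustrative assumption, and the data
are the 21 values listed in the table.

```python
# Ranked January mean temperatures (°F) as listed in Table 16.
temps = [29, 30, 31, 31, 33, 34, 35, 35, 35, 35, 36,
         36, 37, 37, 38, 39, 41, 41, 42, 43, 48]


def plotting_positions(values):
    """Return (value, m/(N+1)) pairs, with m running from 1 to N."""
    ordered = sorted(values)
    n = len(ordered)
    return [(x, m / (n + 1)) for m, x in enumerate(ordered, start=1)]


positions = plotting_positions(temps)
for x, p in positions:
    print(f"{x:3d}   {p:.3f}")
```

Plotting these pairs against a normal probability scale gives the points of
Figure 9.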

f. The normal curve "best fitting" a sample of data is found in the
following manner. Since the area between -∞ and any specified z value is
equivalent to the probability (relative frequency) that an observation will
fall below that z value, the probability of an observation within a class
interval is equal to the difference in areas between the z values at the
boundaries of the interval. These areas can be found using probability paper.

Example 30: Fit a normal curve to the temperature data in Table 5.
A straight line is drawn on probability paper (see Figure 9) with
$\bar{x}$ = 36.5 plotted at .50 and s = 4.7 (36.5 ± 4.7) plotted at .159
and .841. (See computation in Table 17.)

TABLE 17

Computation of "Best Fitting" Normal Curve.

              1           2            3              4
            Upper       Area       Difference    Theoretical
            Class       Below      in Areas,      Frequency     Observed
Class       Limit       U.C.L.         d            N × d       Frequency

            19.5        .0002
20-22       22.5        .0014        .0012        .02 (0.0)         0
23-25       25.5        .0090        .0076        .16 (0.2)         0
26-28       28.5        .0440        .0350        .74 (0.7)         0
29-31       31.5        .1400        .0960       2.02 (2.0)         4
32-34       34.5        .3300        .1900       3.99 (4.0)         2
35-37       37.5        .5800        .2500       5.25 (5.3)         8
38-40       40.5        .8050        .2250       4.73 (4.7)         2
41-43       43.5        .9320        .1270       2.67 (2.7)         4
44-46       46.5        .9840        .0520       1.09 (1.1)         0
47-49       49.5        .9975        .0135        .28 (0.3)         1
50-52       52.5        .9997        .0022        .05 (0.0)         0
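The computation of Table 17 can be sketched in code by replacing the areas
read from probability paper with areas computed from the normal integral.
This is an illustration under that assumption, so the results agree with the
table only to about the accuracy of a graphical reading; the function names
are hypothetical.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def theoretical_frequencies(limits, mean, sd, n):
    """N times the difference in areas between successive class limits."""
    areas = [normal_cdf((u - mean) / sd) for u in limits]
    return [n * (hi - lo) for lo, hi in zip(areas, areas[1:])]


# Class limits 19.5, 22.5, ..., 52.5 from Table 17; mean, s, and N from
# Example 30.
limits = [19.5 + 3.0 * k for k in range(12)]
freqs = theoretical_frequencies(limits, mean=36.5, sd=4.7, n=21)
for lo, f in zip(limits, freqs):
    print(f"{lo + 0.5:.0f}-{lo + 2.5:.0f}   {f:5.2f}")
```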

g. The Chi-square ($\chi^2$) Test is used to test the "goodness of fit" of
the observed frequencies (o) to the theoretical (expected) frequencies (e).
The $\chi^2$ statistic is given by:

(50)  $\chi^2 = \frac{(o_1 - e_1)^2}{e_1} + \frac{(o_2 - e_2)^2}{e_2} + \cdots + \frac{(o_k - e_k)^2}{e_k} = \sum_{j=1}^{k} \frac{(o_j - e_j)^2}{e_j}$

where every $e_j$ should be at least five, as required by the derivation of
the chi-square distribution. The theoretical and observed frequencies agree
exactly if $\chi^2$ = 0. If $\chi^2$ > 0, they do not agree, and the larger
the value of $\chi^2$, the greater the disagreement. In fact, when $\chi^2$
is larger than certain limits, the null hypothesis (i.e., the observed
frequencies do not differ significantly from the theoretical frequencies) is
rejected, and the observed frequencies do differ significantly from the
theoretical frequencies at the usually-accepted 5% (probability of 0.05)
level. The distribution depends on the number of degrees of freedom (ν). The
degrees of freedom (ν) are equal to k - 1 - m, where k is the number of
classes and m is the number of population statistics estimated from the
sample. Tables of $\chi^2$ for different degrees of freedom and various
probability levels are included in most statistics books. A probability
level of 0.05 means that the value of $\chi^2$ for the given number of
degrees of freedom will be equalled or exceeded, in random samples, only five
times in 100.
Example 31: Can the observed frequencies in Example 30 be assumed to
be from a normal distribution at the 5% level of significance? (It is
necessary to combine some of the classes so that nearly five cases are
in each group.)

TABLE 18

Computation of Chi-Square.

Temperature (°F)              ≤ 31     32-34     35-37     38-40     ≥ 41     Total

Observed frequency (o)         4.0       2.0       8.0       2.0      5.0       21
Theoretical frequency (e)      2.9       4.0       5.3       4.7      4.1       21
Difference (d)                 1.1       2.0       2.7       2.7      0.9        -
(Difference)²/Theoretical   1.21/2.9   4.0/4.0  7.29/5.3  7.29/4.7  0.81/4.1     -

$\chi^2$ = 0.417 + 1.0 + 1.375 + 1.551 + 0.198 = 4.54

k = 5 and m = 2 ($\bar{x}$ and s from sample used to compute theoretical
frequencies) so ν = 5 - 1 - 2 = 2. From $\chi^2$ tables, the 5% value of
$\chi^2$ for ν = 2 equals 5.99. Consequently, since 4.54 < 5.99, we can
assume that temperatures given in Table 5 came from a normal population.
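The arithmetic of Table 18 is a direct application of Equation (50). The
sketch below repeats it with the tabulated frequencies. The critical value is
computed with a shortcut that holds only for two degrees of freedom, where the
upper-tail probability of chi-square is exp(-x/2); for any other number of
degrees of freedom a table is still required. The function name chi_square is
an illustrative assumption.

```python
import math

observed = [4.0, 2.0, 8.0, 2.0, 5.0]       # Table 18, observed (o)
theoretical = [2.9, 4.0, 5.3, 4.7, 4.1]    # Table 18, theoretical (e)


def chi_square(o, e):
    """Equation (50): sum of (o - e)^2 / e over all classes."""
    return sum((oj - ej) ** 2 / ej for oj, ej in zip(o, e))


chi2 = chi_square(observed, theoretical)
dof = len(observed) - 1 - 2                # k - 1 - m, mean and s estimated

# Valid for exactly 2 degrees of freedom only: P(chi2 > x) = exp(-x / 2).
critical = -2.0 * math.log(0.05)

print(f"chi-square = {chi2:.2f}, d.f. = {dof}, 5% value = {critical:.2f}")
print("reject normality" if chi2 > critical else "no reason to reject normality")
```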

7.  Sampling. 

a. In statistics, the sample estimate that is calculated usually possesses
an appreciable error of estimate. Therefore, it is important to determine the
most probable value of such statistical parameters as the mean and standard
deviation. The mean for a series of values $X_1, X_2, \ldots, X_N$ is defined
as

$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} X_i$   [Equation (32)]

and represents the best estimate of the mean of the population.

(1) The standard error of the population mean for independent data is
usually given by:

(51)  $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}}$

where $\sigma_x$ is the standard deviation of the variable estimated by

$s_x = \left[ \frac{1}{N-1} \left( \sum_{i=1}^{N} X_i^2 - \frac{1}{N} \Big( \sum_{i=1}^{N} X_i \Big)^2 \right) \right]^{1/2}$   [See Equation (40)]


Because of the relative sparsity of most meteorological data, all consecutive
values available usually are used to compute statistics such as the mean and
standard deviation. In general, there may be persistence in the data and each
value is correlated with adjacent values (autocorrelation). The
autocorrelation, $r_\tau$, at any time interval of lag τ can be estimated from
the sample by:

$r_\tau = \frac{1}{(N-\tau)\, s_x^2} \sum_{i=1}^{N-\tau} (X_i - \bar{x})(X_{i+\tau} - \bar{x})$   [See Equation (46)]

and the standard error of the mean of the autocorrelated data becomes:

(52)  $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}} \sqrt{1 + \frac{2}{N} \sum_{\tau=1}^{N-1} (N - \tau)\, r_\tau}$

Since persistence is probably the primary cause of the autocorrelation in most
meteorological parameters, the autocorrelation at lag τ, $r_\tau$, is related
to that at lag one, $r_1$, by:

(53)  $r_\tau = r_1^{\tau}$

where the autocorrelation at lag one is given by:

(54)  $r_1 = \frac{1}{(N-1)\, s_x^2} \sum_{i=1}^{N-1} (X_i - \bar{x})(X_{i+1} - \bar{x})$

The standard error of the mean for autocorrelated data caused primarily by
persistence is given by:

(55)  $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}} \sqrt{\frac{1+r_1}{1-r_1} - \frac{2 r_1}{N}\, \frac{1 - r_1^{N}}{(1-r_1)^2}}$

and, as given by Mitchell [37] for N >> 10 and/or a small $r_1$, is roughly
equivalent to:

(56)  $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}} \sqrt{\frac{1+r_1}{1-r_1}}$

(2) The 95% confidence limits for the mean of the population, as estimated
by the sample mean, are given by:

(57)  $\bar{x}_{(0.95)} = \bar{x} \pm 1.96\, \sigma_{\bar{x}}$

where $\sigma_{\bar{x}}$ is given by Equations (51), (52), (55), or (56), as
appropriate. The 95% confidence interval is to be interpreted in the following
manner. The interval, being a function of the random variable $\bar{x}$, is
itself a random variable. If 100 samples were taken and 100 95%-confidence
intervals were determined, then we would expect 95 of these intervals to
include the population mean.

Example 32: What are the 95% confidence limits of the mean of the
data given in Table 5? The mean monthly temperatures are independent;
therefore, to determine the standard error of the mean of this data we
use Equation (51). The mean is given by:

$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} X_i = \frac{766}{21} = 36.5$

The standard deviation is given by:

$s_x = \left[ \frac{1}{N-1} \left( \sum_{i=1}^{N} X_i^2 - \frac{1}{N} \Big( \sum_{i=1}^{N} X_i \Big)^2 \right) \right]^{1/2} = \left[ \frac{1}{20} \left( 28{,}382 - \frac{1}{21}\,(586{,}756) \right) \right]^{1/2} = \sqrt{\frac{441.24}{20}} = 4.7$

The standard error of the mean as calculated from Equation (51) becomes:

$\sigma_{\bar{x}} = \frac{4.7}{\sqrt{21}} = 1.03$

The 95% confidence limits of the mean are given by Equation (57):

$\bar{x} \pm 1.96\, \sigma_{\bar{x}} = 36.5 \pm 1.96\,(1.03) = 36.5 \pm 2.02 = 34.5 \text{ and } 38.5$

In this case, the interval 34.5°F to 38.5°F has a probability of 0.95
of containing the population mean.
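The steps of Example 32 can be sketched as a short routine. It assumes the
21 temperatures are those listed in Table 16 and that the data are
independent, so that Equation (51) applies; the function name is
hypothetical. Because the code does not round intermediate values, its output
may differ from the hand computation in the last digit.

```python
import math

# January mean temperatures (°F) as listed in Table 16.
temps = [29, 30, 31, 31, 33, 34, 35, 35, 35, 35, 36,
         36, 37, 37, 38, 39, 41, 41, 42, 43, 48]


def mean_confidence_limits(x, z=1.96):
    """95% limits of the mean for independent data."""
    n = len(x)
    mean = sum(x) / n
    # Sample standard deviation with N - 1 in the denominator.
    s = math.sqrt((sum(v * v for v in x) - sum(x) ** 2 / n) / (n - 1))
    se = s / math.sqrt(n)                    # Equation (51)
    return mean - z * se, mean + z * se      # Equation (57)


low, high = mean_confidence_limits(temps)
print(f"{low:.1f} to {high:.1f}")
```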


Example 33: What are the 95% confidence limits of the mean of the
following mean temperatures for January 1962 at Washington National
Airport (column X of Table 19)? It is assumed that persistence is the
primary cause of the autocorrelation, so Equation (56) is used to
compute the standard error.

TABLE 19

Computation of Variance, Lag $r_1$, and Standard Error.

                            (1)                (2)
Day      X       X²    ($X_i - \bar{x}$)  ($X_{i+1} - \bar{x}$)   (1) × (2)

  1     36     1296       (36-35)            (32-35)                 -3
  2     32     1024       (32-35)            (33-35)                 +6
  3     33     1089       (33-35)            (47-35)                -24
  4     47     2209       (47-35)            (37-35)                +24
  5     37     1369       (37-35)            (45-35)                +20
  6     45     2025       (45-35)            (48-35)               +130
  7     48     2304       (48-35)            (40-35)                +65
  8     40     1600       (40-35)            (30-35)                -25
  9     30      900       (30-35)            (19-35)                +80
 10     19      361       (19-35)            (16-35)               +304
 11     16      256       (16-35)            (24-35)               +209
 12     24      576       (24-35)            (27-35)                +88
 13     27      729       (27-35)            (32-35)                +24
 14     32     1024       (32-35)            (46-35)                -33
 15     46     2116       (46-35)            (35-35)                  0
 16     35     1225       (35-35)            (34-35)                  0
 17     34     1156       (34-35)            (29-35)                 +6
 18     29      841       (29-35)            (30-35)                +30
 19     30      900       (30-35)            (33-35)                +10
 20     33     1089       (33-35)            (35-35)                  0
 21     35     1225       (35-35)            (46-35)                  0
 22     46     2116       (46-35)            (42-35)                +77
 23     42     1764       (42-35)            (35-35)                  0
 24     35     1225       (35-35)            (44-35)                  0
 25     44     1936       (44-35)            (42-35)                +63
 26     42     1764       (42-35)            (44-35)                +63
 27     44     1936       (44-35)            (33-35)                -18
 28     33     1089       (33-35)            (30-35)                +10
 29     30      900       (30-35)            (37-35)                -10
 30     37     1369       (37-35)            (22-35)                -26
 31     22      484          -                  -                     -

  Σ   1083    39897          -                  -                  1070

The mean $\bar{x}$ is determined by

$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} X_i = \frac{1083}{31} \approx 35$

The variance, $s_x^2$, and the standard deviation, $s_x$, are found by using
Equation (40).
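$s_x^2 = \frac{1}{N-1} \left[ \sum_{i=1}^{N} X_i^2 - \frac{1}{N} \Big( \sum_{i=1}^{N} X_i \Big)^2 \right] = \frac{1}{30} \left[ 39{,}897 - \frac{1}{31}\,(1083)^2 \right]$

$= \frac{1}{30}\,[39{,}897 - 37{,}835] = \frac{2062}{30} = 68.7$

$s_x = \sqrt{68.7} = 8.3$

The autocorrelation coefficient for a lag of one day, $r_1$, using
Equation (46):

$r_1 = \frac{1}{(N-1)\, s_x^2} \sum_{i=1}^{N-1} (X_i - \bar{x})(X_{i+1} - \bar{x}) = \frac{1070}{(30)(68.7)} = \frac{1070}{2061} = 0.519$

The standard error of the mean, $\sigma_{\bar{x}}$, using Equation (56):

$\sigma_{\bar{x}} = \frac{s_x}{\sqrt{N}} \sqrt{\frac{1+r_1}{1-r_1}} = \frac{8.3}{\sqrt{31}} \sqrt{3.158} = 2.648$

The 95% confidence limits of the mean are given by Equation (57):

$\bar{x} \pm 1.96\, \sigma_{\bar{x}} = 35 \pm 1.96\,(2.648) = 35.00 \pm 5.19 = 29.81 \text{ and } 40.19$

Therefore, the 95% confidence limits of the population mean for the
autocorrelated data are 29.8°F and 40.2°F.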
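The computation of Example 33 can be sketched as follows, using the daily
values in column X of Table 19. The sketch is an illustration only: it keeps
the unrounded mean (34.9 rather than 35), so its results agree with the hand
computation to about two decimal places, and the function name is
hypothetical.

```python
import math

# Daily mean temperatures (°F), January 1962, column X of Table 19.
x = [36, 32, 33, 47, 37, 45, 48, 40, 30, 19, 16, 24, 27, 32, 46, 35,
     34, 29, 30, 33, 35, 46, 42, 35, 44, 42, 44, 33, 30, 37, 22]


def persistence_standard_error(x):
    """Mean, variance, lag-one autocorrelation, and standard error."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    # Lag-one autocorrelation: products of adjacent deviations.
    lag_sum = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    r1 = lag_sum / ((n - 1) * var)
    # Approximate standard error for persistent data, Equation (56).
    se = math.sqrt(var / n) * math.sqrt((1 + r1) / (1 - r1))
    return mean, var, r1, se


mean, var, r1, se = persistence_standard_error(x)
print(f"mean = {mean:.1f}, variance = {var:.1f}, r1 = {r1:.3f}, se = {se:.2f}")
print(f"95% limits: {mean - 1.96 * se:.1f} to {mean + 1.96 * se:.1f}")
```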

b. For an autocorrelated series of N terms the analyst determines the
effective number, N', of independent terms. N', when used in Equation (51),
gives the standard error of the mean of the autocorrelated series. Equating
Equation (52) with Equation (51) gives

(58)  $N' = N \left[ 1 + \frac{2}{N} \sum_{\tau=1}^{N-1} (N - \tau)\, r_\tau \right]^{-1}$

To find the standard error of the mean of persistent series, Equation (55) or
(56) is used. Therefore, what number N', when substituted in Equation (51),
will give a value for $\sigma_{\bar{x}}$ equal to the correct value given by
Equation (55) or (56)? Equating Equation (55) with (51) gives

(59)  $N' = N \left[ \frac{1+r_1}{1-r_1} - \frac{2 r_1}{N}\, \frac{1 - r_1^{N}}{(1-r_1)^2} \right]^{-1}$

and for N >> 10 and/or a small $r_1$, we find from Equation (59)

(60)  $N' = N\, \frac{1 - r_1}{1 + r_1}$

Example 34: What is the effective number of independent daily
temperatures in the data series given in Example 33? Since N > 10, it is
appropriate to use Equation (60).

$N' = N\, \frac{1 - r_1}{1 + r_1} = 31 \left( \frac{1 - 0.519}{1 + 0.519} \right) = 9.8 \approx 10$

Therefore, the temperatures on every third day are independent of each
other.
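The three expressions for the effective number of independent terms can be
compared numerically. The sketch below assumes the usual forms of Equations
(58), (59), and (60) and uses N = 31 and a lag-one autocorrelation of 0.519
from Example 33; the function names are hypothetical. It also shows that
Equation (58) reduces to Equation (59) when the autocorrelation at lag t is
the lag-one value raised to the power t.

```python
def n_eff_general(n, r):
    """Equation (58); r[t - 1] is the autocorrelation at lag t = 1 .. n-1."""
    return n / (1 + (2 / n) * sum((n - t) * r[t - 1] for t in range(1, n)))


def n_eff_persistence(n, r1):
    """Equation (59), for a series whose autocorrelation is r1 ** t."""
    return n / ((1 + r1) / (1 - r1)
                - (2 / n) * r1 * (1 - r1 ** n) / (1 - r1) ** 2)


def n_eff_approx(n, r1):
    """Equation (60), for N much larger than 10 and/or small r1."""
    return n * (1 - r1) / (1 + r1)


n, r1 = 31, 0.519
lags = [r1 ** t for t in range(1, n)]       # persistence model
print(round(n_eff_general(n, lags), 2),
      round(n_eff_persistence(n, r1), 2),
      round(n_eff_approx(n, r1), 2))
```

For this series the approximation of Equation (60) gives a slightly smaller
value than Equation (59); both round to about 10.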

c. After calculating a sample standard deviation, $s_x$, the analyst may
wish to determine if its value is significantly different from the population
value, $\sigma_x$. The confidence bands of the standard deviation can be found
by means of the chi-square ($\chi^2$) test. The test consists of verifying
whether

(61)  $\sqrt{\frac{N s_x^2}{\chi_1^2}} \ge \sigma_x \ge \sqrt{\frac{N s_x^2}{\chi_2^2}}$

where $\chi_1^2$ and $\chi_2^2$ are the lower and upper limits, respectively,
of the chi-square distribution (N - 1 degrees of freedom) appropriate to a
desired confidence level, e.g., 95%. If $s_x$ is computed from an
autocorrelated series, then the test must be modified by replacing N by N'
from Equations (58), (59), or (60) and the degrees of freedom taken as N' - 1.

Example 35: Find the 95% confidence limits of the standard deviation
for the temperature data in Table 5. Since the data are independent,
it is appropriate to use N = 31 in Equation (61) and $\chi_1^2$ and
$\chi_2^2$ for N - 1 = 30 degrees of freedom with $s_x^2$ = 22.1 to
determine the 95% confidence level.

$\sqrt{40.8} \ge \sigma_x \ge \sqrt{14.6}$

$6.4 \ge \sigma_x \ge 3.8$  for $s_x = 4.7$

The 95% confidence limits are 3.8°F and 6.4°F.

Example 36: Find the 95% confidence limits of the standard deviation
of the autocorrelated data given in Example 33. Since these data are
autocorrelated, it is appropriate to use N' = 10, as found in Example
34 by Equation (60), and $\chi_1^2$ and $\chi_2^2$ for N' - 1 = 9 degrees
of freedom with $s_x^2$ = 68.7 to find the 95% confidence level.

$\sqrt{\frac{(10)(68.7)}{2.70}} \ge \sigma_x \ge \sqrt{\frac{(10)(68.7)}{19.02}}$

$\sqrt{254.4} \ge \sigma_x \ge \sqrt{36.1}$

$15.95 \ge \sigma_x \ge 6.01$

Therefore, the interval 6°F to 16°F has a probability of 0.95 of
containing the population standard deviation.
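The limits in Example 36 follow from dividing N' times the variance by the
two tabulated chi-square values and taking square roots. The sketch below
reproduces that arithmetic. It takes the chi-square values 2.70 and 19.02
from the example rather than computing them, and the function name is an
illustrative assumption.

```python
import math


def sd_confidence_limits(n, variance, chi2_lower, chi2_upper):
    """Limits sqrt(N * s^2 / chi-square), as in Examples 35 and 36.

    chi2_lower and chi2_upper are the tabulated lower and upper limits of
    chi-square for the appropriate degrees of freedom.
    """
    upper = math.sqrt(n * variance / chi2_lower)
    lower = math.sqrt(n * variance / chi2_upper)
    return lower, upper


# Example 36: N' = 10, variance 68.7, limits 2.70 and 19.02 for 9 d.f.
lower, upper = sd_confidence_limits(10, 68.7, 2.70, 19.02)
print(f"{lower:.2f} <= sigma <= {upper:.2f}")
```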

8.  Statistical  Inference. 

a. Since dealing with populations themselves is generally impractical in
statistics, absolute proofs of statements about populations are rarely
possible. Therefore, an investigator must use statistical tests to estimate
the accuracy of his statements. One such test is the use of statistical
inference.

b. Consider any population. Suppose that a value, "h," has been used for
a parameter of a population and that an analyst wants to test this value. He
has two ways of making a wrong decision. He may reject "h" when it is right
(called the Type I or α error), or he may accept "h" when it is wrong (called
the Type II or β error). Statistical inference enables him to limit α, the
chance of rejecting a correct hypothesis, to any percent he chooses. Once α
is chosen, he can also determine β, the chance of accepting an incorrect
hypothesis. For any particular α, the value of β will depend on how wrong the
hypothesis is. Note that this test does not tell him the chance of the
hypothesis being right or wrong! It tells him only the chance of rejecting it
when it is right, or accepting it when it is wrong. In general, as α is
decreased, β will be increased. The analyst usually decides how high he wants
α to be and sets up his test based on this decision. If he wants α to be 5%,
he is said to have set up a level of significance of 5%.

c. To apply the test, we take a sample and compute from it the parameter
in question and the standard error, $\sigma_{\bar{x}}$, of this parameter.
Assume the sample distribution of most parameters to be normal or nearly so.
Using the hypothesis "h" as the mean of the parameter in question and the
standard error of this parameter [paragraph 7a(1) of this chapter], we draw
the sample distribution as a straight line on normal probability paper. A 5%
level of significance means that the hypothesis will be accepted if the
sample value of the parameter falls within the 2.5 through 97.5% ("h" ± 1.96
$\sigma_{\bar{x}}$) limits on our distribution. In other words, if our
hypothesis is correct, the analyst will accept it 95% of the time because 95%
of the sample values will fall within the limits. Suppose, however, that the
hypothesis is wrong, and the value of the population parameter is not "h" but
"m." The sample distribution of the parameter is still normal and has the
same standard error, but it now has a mean of "m." The probability of
occurrence of "h" ± 1.96 $\sigma_{\bar{x}}$ on our new distribution is β.

Example 37: Define a population as the monthly temperatures for all
years at Washington National Airport. Suppose that 38°F has been used
as the mean of these mean monthly temperatures. Test this value to
decide whether or not to accept it. For this example, set up a level
of significance of 5% so that if the 38°F is correct, we will accept
it 95% of the time.

(a) Use the sample years 1943-1963 to decide if the hypothesis
should be accepted or rejected. The mean ($\bar{x}$) of this sample is
computed to be 36.5°F, and s is its standard deviation. The standard
error of the mean of the population is given by

$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}}$   [See Equation (51)]

where $\sigma_{\bar{x}}$ is the standard error of the mean, N is the number
in the sample, and $\sigma_x$ (the standard deviation of the population) is
approximated by s. $\sigma_{\bar{x}}$ is computed to be 1.03 (see Example
32). Using as the mean the hypothesis, 38°, and $\sigma_{\bar{x}}$ = 1.03,
draw the sample distribution as a straight line on normal probability paper
(Line "A" on Figure 10). Note that a level of significance of 5% represents
a value of 36.0 (h - 1.96 $\sigma_{\bar{x}}$) at the 2.5% limit and a value
of 40.0 (h + 1.96 $\sigma_{\bar{x}}$) at the 97.5% limit. Any sample value
between these limits causes the analyst to accept the hypothesis. Since the
sample value is 36.5°, he would accept 38°.

(b) What is the chance of his accepting this hypothesis if it is
1, 2, or 3° off? Using the same $\sigma_{\bar{x}}$, but this time 39° as
the mean, draw another sample distribution (Line "B" on Figure 10). The
percentage of time that the values 36.0° through 40.0° occur on this
distribution is the β error. In this case, the β error is 83% minus
approximately 0.2%, or 82.8%. In other words, the analyst runs an 82.8%
chance of accepting 38°, even if the actual mean is one degree off, i.e.,
either 37° or 39°. Similarly, β for two degrees off would be 50% (Line "C"
on Figure 10), and for three degrees off would be 16% (Line "D" on
Figure 10).
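The β error of Example 37 can also be estimated without probability paper.
The sketch below assumes a normal sampling distribution with the standard
error of 1.03 from Example 32 and computes the chance that a sample mean
falls inside the acceptance limits when the true mean differs from the
hypothesis; the function names are hypothetical. Values computed this way
may differ by a percent or so from those read graphically off Figure 10.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def beta_error(h, true_mean, se, z=1.96):
    """Chance of accepting hypothesis h when the mean is really true_mean."""
    low, high = h - z * se, h + z * se       # acceptance limits around h
    return (normal_cdf((high - true_mean) / se)
            - normal_cdf((low - true_mean) / se))


for offset in (1, 2, 3):
    print(offset, round(beta_error(38.0, 38.0 + offset, 1.03), 3))
```

When the true mean equals the hypothesis, the same function returns 0.95,
the chance of correctly accepting it at the 5% level of significance.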
Figure 10. Sample Distribution of the Mean of the Mean Monthly Temperatures
for Washington National Airport.

9. Regression and Correlation.

a. In meteorology and climatology it is often necessary to determine if
two or more weather factors or parameters are related and, if so, how they are
related. A relationship between two or more parameters is called correlation.
An equation that can be used to predict one parameter from information about
the other and to estimate the range of the error in the prediction can be
derived from this correlation.

b. If data for two variables are plotted on a scatter diagram (see Figure
11) and appear to fall along a straight line, a linear equation probably gives
the best relationship. A line of regression is the line of best fit of the
data and is given by the slope-intercept equation of the straight line,

(62)  $Y = mX + b$

where m is the slope with respect to the X axis and b is the Y-intercept.
Usually the equation of the line is found by the method of least squares,
which gives a line such that the sum of the squares of the deviations of the
observed values from the line is a minimum. The constants, m and b, are
determined by solving simultaneously the following two normal equations of the
least square line:

(63)  $\sum Y = m \sum X + bN$

(64)  $\sum XY = m \sum X^2 + b \sum X$

Solving these equations gives:

(65)  $b = \frac{\sum Y \sum X^2 - \sum X \sum XY}{N \sum X^2 - (\sum X)^2}$

(66)  $m = \frac{\sum Y - Nb}{\sum X}$

Equations (65) and (66), when substituted into the slope-intercept Equation
(62), give the line of best fit for the regression of Y on X by the least
square method.


c. The standard error of estimate, which is analogous to the standard
deviation, is a measure of the scatter about the line of regression of Y on X
and is given by

(67)  $s_{Y \cdot X} = \sqrt{\frac{\sum Y^2 - b \sum Y - m \sum XY}{N}}$

If lines are drawn parallel to the regression line of Y on X at distances
equal to $s_{Y \cdot X}$ above and below the line measured in the Y direction,
about 68% of the data points fall between the two lines (Figure 11).




TABLE 20

Computation of Coefficients, Standard Error
of Estimate, and Correlation Coefficient.

           X          Y          XY         X²         Y²

  1       0.33       0.26      .0858      .1089      .0676
  2       0.16       0.07      .0112      .0256      .0049
  3       0.43       0.54      .2322      .1849      .2916
  4       0.42       0.42      .1764      .1764      .1764
  5       0.13       0.03      .0039      .0169      .0009
  6       0.22       0.24      .0528      .0484      .0576
  7       0.43       0.75      .3225      .1849      .5625
  8       0.27       0.33      .0891      .0729      .1089
  9       0.28       0.33      .0924      .0784      .1089
 10       0.18       0.10      .0180      .0324      .0100
 11       0.18       0.09      .0162      .0324      .0081
 12       0.38       0.33      .1254      .1444      .1089
 13       0.14       0.07      .0098      .0196      .0049
 14       0.30       0.29      .0870      .0900      .0841
 15       0.50       0.63      .3150      .2500      .3969
 16       0.37       0.15      .0555      .1369      .0225
 17       0.46       0.77      .3542      .2116      .5929
 18       0.48       0.57      .2736      .2304      .3249
 19       0.35       0.46      .1610      .1225      .2116
 20       0.38       0.55      .2090      .1444      .3025
 21       0.39       0.48      .1872      .1521      .2304
 22       0.29       0.46      .1334      .0841      .2116
 23       0.22       0.05      .0110      .0484      .0025
 24       0.32       0.08      .0256      .1024      .0064
 25       0.16       0.10      .0160      .0256      .0100
 26       0.36       0.32      .1152      .1296      .1024
 27       0.30       0.13      .0390      .0900      .0169
 28       0.24       0.07      .0168      .0576      .0049
 29       0.45       0.63      .2835      .2025      .3969
 30       0.15       0.03      .0045      .0225      .0009
 31       0.34       0.33      .1122      .1156      .1089
 32       0.36       0.43      .1548      .1296      .1849
 33       0.34       0.37      .1258      .1156      .1369

  Σ     10.310     10.460     3.9160     3.5875     4.8602

between  the  parameters. 

Example 38: Is there a relationship between the percentage frequency
of precipitation rate ≥ 0.25 inch per hour and the ratio of annual
precipitation (inches) and days per year with measurable precipitation
(≥ 0.01 inch)? Data were compiled for 33 U.S. Weather Bureau
first-order weather stations (Table 20) with:

X = (Annual precipitation, inches) / (Number of days with measurable
    precipitation)

Y = Percentage frequency of precipitation rate ≥ 0.25 inch per hour

Values for X and Y are plotted on a scatter diagram (Figure 11) and
the points form a straight line.

$N = 33, \quad s_Y^2 = \frac{\sum Y^2 - \frac{1}{N} (\sum Y)^2}{N - 1} = \frac{4.8602 - \frac{1}{33}\,(10.46)^2}{32} = 0.048276$

$b = \frac{\sum Y \sum X^2 - \sum X \sum XY}{N \sum X^2 - (\sum X)^2} = \frac{(10.46)(3.5875) - (10.31)(3.9160)}{(33)(3.5875) - (10.31)^2} = -0.235598$

$m = \frac{\sum Y - Nb}{\sum X} = \frac{10.46 - (33)(-0.235598)}{10.31} = 1.7686$

Y = mX + b

Line of Regression

Y = 1.769 X - 0.2356

Standard Error of Estimate

$s_{Y \cdot X} = \sqrt{\frac{\sum Y^2 - b \sum Y - m \sum XY}{N}} = \sqrt{\frac{4.8602 - (-0.235598)(10.46) - (1.7686)(3.9160)}{33}} = 0.109895$

Linear Correlation Coefficient

$r = \sqrt{1 - \frac{s_{Y \cdot X}^2}{s_Y^2}} = \sqrt{1 - \frac{0.0120769}{0.048276}} = 0.87$

Therefore, there appears to be a good relation between the ratio and
the percentage frequency of a precipitation rate of ≥ 0.25 inch per
hour.
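The computations of Example 38 can be sketched from the X and Y columns of
Table 20. The sketch follows the forms used in the example (N in the standard
error of estimate, N - 1 in the variance of Y) and takes the positive square
root for r, which suits a rising line; the function name is an illustrative
assumption.

```python
import math

# Columns X and Y of Table 20 (33 stations).
x = [0.33, 0.16, 0.43, 0.42, 0.13, 0.22, 0.43, 0.27, 0.28, 0.18, 0.18,
     0.38, 0.14, 0.30, 0.50, 0.37, 0.46, 0.48, 0.35, 0.38, 0.39, 0.29,
     0.22, 0.32, 0.16, 0.36, 0.30, 0.24, 0.45, 0.15, 0.34, 0.36, 0.34]
y = [0.26, 0.07, 0.54, 0.42, 0.03, 0.24, 0.75, 0.33, 0.33, 0.10, 0.09,
     0.33, 0.07, 0.29, 0.63, 0.15, 0.77, 0.57, 0.46, 0.55, 0.48, 0.46,
     0.05, 0.08, 0.10, 0.32, 0.13, 0.07, 0.63, 0.03, 0.33, 0.43, 0.37]


def least_squares(x, y):
    """Slope, intercept, standard error of estimate, and correlation."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    syy = sum(b * b for b in y)
    b = (sy * sxx - sx * sxy) / (n * sxx - sx ** 2)   # Equation (65)
    m = (sy - n * b) / sx                             # Equation (66)
    s_yx = math.sqrt((syy - b * sy - m * sxy) / n)    # Equation (67)
    var_y = (syy - sy ** 2 / n) / (n - 1)             # variance of Y
    r = math.sqrt(1 - s_yx ** 2 / var_y)              # correlation
    return m, b, s_yx, r


m, b, s_yx, r = least_squares(x, y)
print(f"Y = {m:.3f} X {b:+.4f}, s_yx = {s_yx:.4f}, r = {r:.2f}")
```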


Example 39: What is the relationship between the ratio given in
Example 38 (annual precipitation in inches divided by number of days
with measurable precipitation) and (a) the percentage frequency of
precipitation rate ≥ 0.50 inch per hour, and (b) the percentage
frequency of precipitation rate ≥ 1.00 inch per hour? The computations
yield the following:

First, ratio versus percentage frequency of precipitation rate ≥ 0.50
inch per hour.

Line of Regression

Y = 0.763 X - 0.122

Standard Error of Estimate

$s_{Y \cdot X}$ = 0.0568

Linear Correlation Coefficient

r = 0.82

Second, ratio versus percentage frequency of precipitation rate ≥ 1.00
inch per hour.

Line of Regression

Y = 0.225 X - 0.0388

Standard Error of Estimate

$s_{Y \cdot X}$ = 0.0230

Linear Correlation Coefficient

r = 0.72

e. Some nonlinear equations can be reduced to linear form. After the
data for two variables are plotted, it may be apparent that a straight line
does not fit the data, but a curve might. The data may then be plotted on
log-log graph paper or semi-log graph paper.

(1) If a straight line results on log-log paper the equation

(69)  $Y = uX^{v}$

satisfies the data. Equation (69) can be reduced by taking logarithms of both
sides of the equation and the resulting equation becomes:

(70)  $\log Y = \log u + v \log X$

Setting log Y = Y', log u = u', and log X = X', the resulting linear equation
becomes

(71)  $Y' = u' + vX'$

As seen by Equation (70), the logarithms of Y and X are used to calculate the
slope (v) and the Y-intercept (log u) of Equation (71). After the line of
best fit, Equation (71), has been derived, it is transformed back to Equation
(69) form.

(2) If a straight line on log-log paper is not obtained, the data can
be plotted on semi-log paper, first with X on the linear scale and Y on the
log scale. If the result is not a straight line, reverse the procedure. If a
straight line results from either of the two graphs, then Equation

(72)  $Y = uv^{X}$  (X on the linear scale)

or

(73)  $X = uv^{Y}$  (Y on the linear scale)

will satisfy the data. Equations (72) and (73) can be reduced by taking
logarithms of both sides.

(74)  $\log Y = \log u + X \log v$

Setting log Y = Y', log u = u', and log v = v', the resulting linear equation
becomes

(75)  $Y' = u' + v'X$

Logarithms are used for the dependent variable and the actual value for the
independent variable to calculate the slope (log v) and the Y-intercept
(log u). The line of regression, Equation (75), is then derived and
transformed to the exponential form, Equation (72) or (73), whichever is
appropriate.
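The reduction to linear form can be sketched in code. The data below are
hypothetical and noise-free, generated only to show that a least-squares fit
on the transformed variables recovers the constants u and v. The function
names are illustrative, and the slope and intercept are written in a form
algebraically equivalent to solving the two normal equations of the least
square line.

```python
import math


def fit_line(x, y):
    """Least-squares slope and intercept from the normal equations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    m = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b = (sy - m * sx) / n
    return m, b


def fit_power_law(x, y):
    """Fit Y = u * X**v by regressing log Y on log X."""
    v, log_u = fit_line([math.log10(a) for a in x],
                        [math.log10(b) for b in y])
    return 10 ** log_u, v


def fit_exponential(x, y):
    """Fit Y = u * v**X by regressing log Y on X."""
    log_v, log_u = fit_line(x, [math.log10(b) for b in y])
    return 10 ** log_u, 10 ** log_v


# Hypothetical data: exact power law and exact exponential.
xs = [1.0, 2.0, 4.0, 8.0]
u, v = fit_power_law(xs, [3.0 * a ** 1.5 for a in xs])
u2, v2 = fit_exponential(xs, [2.0 * 1.3 ** a for a in xs])
print(round(u, 6), round(v, 6), round(u2, 6), round(v2, 6))
```

With real observations the recovered constants minimize the squared
deviations of the logarithms, not of the original values, which is the same
property the graphical method on log paper has.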

f. The calculated coefficient (r) is only a sample statistic. The
population correlation coefficient (ρ) is needed, but is not known. However,
confidence limits can be calculated by using Fisher's Z transformation [11].

(76)  $Z = \frac{1}{2}\,[\ln(1 + r) - \ln(1 - r)]$

Values for Z are given in Table 21. Z from Equation (76) is approximately
normally distributed with mean

(77)  $\mu_Z = \frac{1}{2}\,[\ln(1 + \rho) - \ln(1 - \rho)]$

and standard deviation

(78)  $\sigma_Z = \frac{1}{\sqrt{N - 3}}$

Example 40: What are the confidence limits for the correlation
coefficient of 0.87 with N = 33 given in our computation example? The 95%
confidence limits for $\mu_Z$ are given by

$Z \pm 1.96\, \sigma_Z = \frac{1}{2}\,[\ln(1 + 0.87) - \ln(1 - 0.87)] \pm 1.96\, \sigma_Z = 1.333 \pm 1.96\, \sigma_Z = 0.973 \text{ and } 1.693$

If

$\mu_Z = \frac{1}{2}\,[\ln(1 + \rho) - \ln(1 - \rho)] = 0.973, \quad \rho = 0.75$

and if

$\mu_Z = \frac{1}{2}\,[\ln(1 + \rho) - \ln(1 - \rho)] = 1.693, \quad \rho = 0.93$

Therefore, the 95% confidence limits for the population correlation
coefficient ρ are 0.75 and 0.93, corresponding to the sample correlation
coefficient of 0.87.
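The confidence limits of Example 40 can be sketched with Fisher's Z
transformation and its inverse, the hyperbolic tangent. The standard
deviation of Z is taken here as 1/sqrt(N - 3), the usual large-sample value;
that choice is an assumption of this sketch, and the function name is
hypothetical.

```python
import math


def correlation_confidence_limits(r, n, z_crit=1.96):
    """Confidence limits for the population correlation coefficient."""
    z = 0.5 * (math.log(1 + r) - math.log(1 - r))   # Equation (76)
    sigma_z = 1.0 / math.sqrt(n - 3)                # assumed large-sample value
    lo, hi = z - z_crit * sigma_z, z + z_crit * sigma_z
    # tanh inverts the transformation, turning limits on Z into limits on rho.
    return math.tanh(lo), math.tanh(hi)


low, high = correlation_confidence_limits(0.87, 33)
print(f"{low:.2f} to {high:.2f}")
```

The inverse step replaces the reverse lookup in Table 21.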

TABLE 21

Fisher's Z Transformation.

   r       0.00000    0.01000    0.02000    0.03000    0.04000

0.00000    0.00000    0.01000    0.02000    0.03001    0.04002
0.10000    0.10034    0.11045    0.12058    0.13074    0.14093
0.20000    0.20273    0.21317    0.22366    0.23419    0.24477
0.30000    0.30952    0.32055    0.33165    0.34283    0.35409
0.40000    0.42365    0.43561    0.44769    0.45990    0.47223
0.50000    0.54931    0.56273    0.57634    0.59015    0.60416
0.60000    0.69315    0.70892    0.72501    0.74142    0.75817
0.70000    0.86730    0.88718    0.90764    0.92873    0.95048
0.80000    1.09861    1.12703    1.15682    1.18814    1.22117
0.90000    1.47222    1.52752    1.58903    1.65839    1.73805

   r       0.05000    0.06000    0.07000    0.08000    0.09000

0.00000    0.05004    0.06007    0.07011    0.08017    0.09024
0.10000    0.15114    0.16139    0.17167    0.18198    0.19234
0.20000    0.25541    0.26611    0.27686    0.28768    0.29857
0.30000    0.36544    0.37689    0.38842    0.40006    0.41180
0.40000    0.48470    0.49731    0.51007    0.52298    0.53606
0.50000    0.61838    0.63283    0.64752    0.66246    0.67767
0.60000    0.77530    0.79281    0.81074    0.82911    0.84796
0.70000    0.97296    0.99622    1.02033    1.04537    1.07143
0.80000    1.25615    1.29334    1.33308    1.37577    1.42193
0.90000    1.83178    1.94591    2.09230    2.29756    2.64665

NOTE: Z is negative when r is negative.


10.  Time  Series. 

a. Periodicity of heavenly bodies is common knowledge. Night follows day
and seasons change regularly. These periodicities are reflected in surface
temperature changes that do not follow an exact period; yet, the probability
of certain temperature changes tends to be periodic. Harmonic Analysis and
Spectrum Analysis isolate those portions of the march of a meteorological
variable such as temperature, ceiling height, and wind speed, to name only a
few. While only a brief discussion of harmonic and spectrum analysis is
included in this pamphlet, those readers desiring further information
concerning these subjects are referred to the complete discussion contained in
"Handbook of Statistical Methods in Meteorology" [11] and "Some Applications
of Statistics to Meteorology" [38].

b. Harmonic Analysis generates a series of cosine functions that, when
summed together, equal the original data of the series. A periodicity may be
simple, composed of a single cosine curve, or may be more complex, containing
many harmonics. According to mathematical principles, any function that is
defined at every point in the interval can be represented by an infinite
series of sine and cosine functions. A Fourier Series, established by the
Fourier Analysis, represents this function. In the case of meteorological
data, only a finite number of discrete points exists, so a finite number of
harmonics (half the number of observations) will account for all the
variation. For practical purposes, however, the first two or three harmonics
account for most of the variation, and physical meaning can generally be
attributed to these harmonics. Higher harmonics are reflections of "noise" in
the observations or are purely of mathematical importance. The value X at
time $t_j$ equals the mean $\bar{x}$ plus the sum of all N/2 harmonics, thus:


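(79)  $X_j = \bar{X} + \sum_{i=1}^{N/2} \left[ A_i \sin\left( \frac{360^\circ}{P}\, i\, t_j \right) + B_i \cos\left( \frac{360^\circ}{P}\, i\, t_j \right) \right]$

where P is the fundamental or total period of the function (the total length
of record and not necessarily equal to N). For simplicity of writing, this
series can be written as:

(80)  $X_j = \bar{X} + \sum_{i=1}^{N/2} C_i \cos\left[ \frac{360^\circ}{P}\, i \left( t_j - T_{\max_i} \right) \right]$

where

(81)  $C_i = \sqrt{A_i^2 + B_i^2}$

and:

(82)  $T_{\max_i} = \frac{P}{360^\circ\, i} \arcsin\left( \frac{A_i}{C_i} \right) = \frac{P}{360^\circ\, i} \arccos\left( \frac{B_i}{C_i} \right)$

where the trigonometric functions give the same angular value. (On some
occasions, the angles are double-valued.) This comparison assures the proper
quadrant of the determined angle, which is the time of the maximum for that
particular harmonic. The problem is to compute $A_i$ and $B_i$, from which all
information can be obtained. These are calculated from:

(83)  $A_i = \frac{2}{N} \sum_{j=1}^{N} X_j \sin\left( \frac{360^\circ}{P}\, i\, t_j \right)$

(84)  $B_i = \frac{2}{N} \sum_{j=1}^{N} X_j \cos\left( \frac{360^\circ}{P}\, i\, t_j \right)$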


The origin of time is immaterial, as the final harmonic series will always sum
to the original data with changes to $A_i$, $B_i$, and $T_{\max_i}$. How many
harmonics are needed? This can best be answered by calculating the variance of
each harmonic (each of which is independent) and forming a ratio of the
variance explained by that harmonic to the total variance of X. While all
harmonics together explain 100% of the variance, the first three harmonics
usually account for 90% or more. The variance for each, except the N/2
harmonic, is $C_i^2/2$. For the N/2 harmonic, the variance is $C_i^2$. Thus,
the percentage variance of the i-th harmonic is:


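(85)  $\%\ \text{variance}_i = 100 \times \frac{N\, C_i^2}{2\,(N - 1)\, s_X^2}$

where $s_X^2$ denotes the sample variance of X (divisor N - 1), so that
$(N - 1)\, s_X^2 / N$ is the total variance.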


Example 41: Consider the mean number of January days that paradrop
criteria are met for each hour at Seymour-Johnson AFB. The mean number
of days at each hour is influenced by the three limiting meteorological
parameters in the paradrop criteria: ceiling ≥ 2000 feet, visibility
≥ 3 miles, and surface winds < 10 knots. The mean number of days of
paradrop criteria for January at each hour (local time) at
Seymour-Johnson AFB is given in Table 22. In this example, P = 24
since the time units are hours and the total time is one day (24
hours). By coincidence, there is an observation for each hour, so here
N = 24 as well. To generate the first harmonic, we have:

$A_1 = \frac{2}{24} \sum_{j=1}^{24} X_j \sin\left( \frac{360^\circ}{24}\, t_j \right)$

$B_1 = \frac{2}{24} \sum_{j=1}^{24} X_j \cos\left( \frac{360^\circ}{24}\, t_j \right)$

TABLE 22

January Paradrop Criteria.

  Hour        Mean No.          Hour        Mean No.
  "t"         of Days "X"       "t"         of Days "X"
  Midnight      21.6            Noon          15.6
     1          21.1             13           15.7
     2          21.2             14           16.2
     3          20.8             15           16.5
     4          20.3             16           18.3
     5          20.4             17           20.5
     6          20.0             18           23.0
     7          19.2             19           23.1
     8                           20           23.4
     9          18.0             21           22.4
    10          17.4             22           22.4
    11          17.5             23           21.5

Considering only a few sine terms, we have:

Term 1  = 21.6 × sin[(360°/24)(0)]  =  0.00
Term 2  = 21.1 × sin[(360°/24)(1)]  =  5.45
Term 3  = 21.2 × sin[(360°/24)(2)]  = 10.55
  ...
Term 24 = 21.5 × sin[(360°/24)(23)] = -5.56

After obtaining every $A_i$ and $B_i$, the remainder of the calculations
can be completed:

$C_1$ = 2.901 days    $T_{\max_1}$ = -0.910 hours    % variance$_1$ = 74.6
$C_2$ = 1.492 days    $T_{\max_2}$ =  7.223 hours    % variance$_2$ = 19.7
$C_3$ = 0.589 days    $T_{\max_3}$ =  2.383 hours    % variance$_3$ =  3.1

The first three harmonics account for 97.4% of the variance. The
actual times of the maximum for the various harmonics are calculated
from midnight, the origin of time for this example.

$T_{\max_1}$ = 2305
$T_{\max_2}$ = 0713
$T_{\max_3}$ = 0223

A graph of the mean number of days (observed), along with a sketch of
three harmonics, is shown in Figure 12.
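
The harmonic-analysis steps used in Example 41 can be sketched in code. The following is a minimal illustration under stated assumptions: the observations are equally spaced, they cover exactly one fundamental period (P = N time units, as in the hourly example), and the time origin is the first observation. The function name `harmonic_analysis` and the synthetic test series are hypothetical; the Table 22 values are not used here because one hourly entry is missing from the table as reproduced.

```python
import math


def harmonic_analysis(x):
    """Return (mean, harmonics) for N equally spaced values spanning one period.

    Each harmonic is a dict with amplitude coefficients A and B, amplitude C,
    time of maximum t_max (in time steps from the first observation), and the
    percent of the total variance it explains.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(x) / n
    total_var = sum((v - mean) ** 2 for v in x) / n  # total variance, divisor N
    harmonics = []
    for i in range(1, n // 2 + 1):
        omega = 2.0 * math.pi * i / n  # radians per time step (assumes P = N)
        s = sum(v * math.sin(omega * t) for t, v in enumerate(x))
        c = sum(v * math.cos(omega * t) for t, v in enumerate(x))
        is_last = (n % 2 == 0) and (i == n // 2)  # the N/2 harmonic
        scale = 1.0 / n if is_last else 2.0 / n
        a_i, b_i = scale * s, scale * c
        c_i = math.hypot(a_i, b_i)
        # atan2 resolves the quadrant that the arcsin/arccos comparison checks.
        t_max = math.atan2(a_i, b_i) / omega
        var_i = c_i ** 2 if is_last else c_i ** 2 / 2.0
        pct = 100.0 * var_i / total_var if total_var > 0 else 0.0
        harmonics.append(
            {"i": i, "A": a_i, "B": b_i, "C": c_i, "t_max": t_max, "pct_var": pct}
        )
    return mean, harmonics


if __name__ == "__main__":
    # Synthetic 24-point series with two known harmonics (illustrative only).
    series = [
        20.0
        + 3.0 * math.cos(2.0 * math.pi * (t - 5) / 24)
        + 1.0 * math.cos(2.0 * math.pi * 2 * (t - 2) / 24)
        for t in range(24)
    ]
    mean, harmonics = harmonic_analysis(series)
    for h in harmonics[:3]:
        print(h["i"], round(h["C"], 3), round(h["t_max"], 3), round(h["pct_var"], 1))
```

A negative `t_max` (such as the -0.910 hours found for the first harmonic in Example 41) simply means the maximum falls before the time origin; adding the harmonic's period P/i converts it to a clock time.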


c. Spectrum analysis indicates whether a harmonic is important or
unimportant. The graph of the percent variance of each harmonic against the
harmonic number is the spectrum of that data. The spectrum of a time series
shows the contributions of the various frequencies. Figure 13 presents the
spectra for all 12 harmonics of the time series of the data in Example 41.
Only the first three harmonics have a significant percent of the variance
associated with them. For the 4th and higher harmonics, the variance is more
likely a reflection of noise in the data. The actual determination of
spectrum analysis is a harmonic analysis of the autocorrelation, resulting in
a smoothing of the calculated variances. The procedure for calculating the
spectrum analysis is quite straightforward. First, autocorrelations for lags
from 0 to m (where m = P/2) are determined. Since the autocorrelation $r_L$
between hour A and hour B is exactly equal to the autocorrelation between
hour B and hour A (that is, it is the same for lag L and lag -L), a
symmetrical function results, with the coefficients of all sine terms being
zero (0). Thus, where 2m is the fundamental period and $r_L$ is the
autocorrelation coefficient of lag L, the actual calculation becomes:


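(86)  $B_i = \frac{r_0}{m} + \frac{r_m}{m} (-1)^i + \frac{2}{m} \sum_{L=1}^{m-1} r_L \cos\left( \frac{180^\circ\, i\, L}{m} \right)$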


In the case of $B_0$ and $B_m$, the coefficients resulting from the formula
must be divided by two. If $B_i$ is plotted as a function of i, the resulting
curve is a smoothed version of the original series and is scaled (or
normalized). A better estimate of the smoothed spectrum function is obtained
by weighting the coefficients:


(87)  $W_i = \tfrac{1}{4} B_{i-1} + \tfrac{1}{2} B_i + \tfrac{1}{4} B_{i+1}$


NOTE: The $W_i$'s are superimposed over the spectrum of the variances
in Figure 13.

A spectrum analysis is useful in understanding the physical principles
involved in the variations of the data. In these data, it is obvious that the
daily cycle is the primary influence on the variation. In other data sets,
significant maxima and minima are important even if they do not occur as sharp
peaks or troughs, since these indicate the likelihood or unlikelihood of
variations near certain periods.
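
The lag-correlation procedure described in paragraph c can be sketched as follows. This is an illustrative sketch, not the report's own program. It assumes the common textbook form of the method: autocorrelations $r_0 \ldots r_m$ are cosine-transformed into raw estimates $B_i$, the two end estimates are halved, and the result is smoothed with 1/4, 1/2, 1/4 weights. The end-point smoothing rule and all function names are assumptions introduced here.

```python
import math


def lag_autocorrelations(x, m):
    """Autocorrelation coefficients r_0 .. r_m of the series x (r_0 = 1)."""
    n = len(x)
    if not 1 <= m < n:
        raise ValueError("m must satisfy 1 <= m < len(x)")
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series has zero variance")
    return [
        sum(dev[j] * dev[j + lag] for j in range(n - lag)) / denom
        for lag in range(m + 1)
    ]


def raw_spectrum(r):
    """Cosine transform of the autocorrelations; end estimates are halved."""
    m = len(r) - 1
    b = []
    for i in range(m + 1):
        inner = sum(r[lag] * math.cos(math.pi * i * lag / m) for lag in range(1, m))
        value = r[0] / m + (r[m] / m) * (-1) ** i + (2.0 / m) * inner
        if i in (0, m):
            value /= 2.0
        b.append(value)
    return b


def smooth_spectrum(b):
    """Apply 1/4, 1/2, 1/4 weights; the two end-point rules are assumed."""
    m = len(b) - 1
    w = [0.5 * (b[0] + b[1])]
    w += [0.25 * b[i - 1] + 0.5 * b[i] + 0.25 * b[i + 1] for i in range(1, m)]
    w.append(0.5 * (b[m - 1] + b[m]))
    return w


if __name__ == "__main__":
    # Synthetic series with a 12-step cycle; with m = 12 the fundamental
    # period is 2m = 24 steps, so the cycle should appear at harmonic i = 2.
    series = [math.cos(2.0 * math.pi * t / 12) for t in range(120)]
    b = raw_spectrum(lag_autocorrelations(series, 12))
    print([round(v, 3) for v in smooth_spectrum(b)])
```

With the end estimates halved, the raw estimates sum to $r_0$, which is the sense in which the curve is normalized.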

11. Converting Percentage Frequency to Days.

a. The mean number of days per month of the occurrence of a phenomenon
(such as thunderstorm, fog, rain, or wind) for a specific hour can be
determined directly. However, if the mean number of days at any hour is
needed, it must be estimated, because it depends on the mean duration of the
phenomenon.

b. To convert percentage frequency to mean number of days at a specific
hour, simply multiply the probability (percentage frequency/100) at the
specific hour by the number of days in the particular month.

(88)  $M = Np$

where N is the number of days in the month and p is the probability at the
specific hour.

Example 42: What is the mean number of days per month that a
thunderstorm will occur at Saigon, South Vietnam, at 1600 LST in June if
the percentage frequency of occurrence at this hour is 3.7%? Using
Equation (88):

$M = 30 \times \frac{3.7}{100} = 1.1$ days per month

c. To convert percentage frequency (all hours) to mean number of days at
any hour, it is necessary first to make an estimate of the duration of the
phenomenon. By definition, a day with a phenomenon is a day on which the
phenomenon occurred. The phenomenon may have occurred on only one observation
or on more than one, but, in either case, it is considered as only one day
with the phenomenon. Brooks and Carruthers [11] developed the following method
for providing a reliable estimate of the mean number of days when the
percentage frequency (all hours) is known. The mean number of days of the
phenomenon in a month is given by:

(89)  $M = N (1 - q_t)$

where $q_t$ is the probability of having no occurrence of the phenomenon at
all on any one day, and N is the number of days in the particular month. The
problem is to express $q_t$ in terms of the all-hours probability p of the
occurrence of the phenomenon. The probability $q_t$ equals the probability of
no occurrence at the first hour times the probability of no independent
occurrence on the second and all subsequent observations on that day. The
probability of nonoccurrence at the first hour is equal to the all-hours
probability of nonoccurrence of the phenomenon, q. The probability of an
independent occurrence of the phenomenon at the second and subsequent
observation hours is equal to p(1 - a), where 1 - a is the probability that
the phenomenon will not occur on any two successive observations and a is
given by:

(90)  $a = 1 - \frac{1}{D}$

D is the mean duration of the phenomenon in hours. Since p(1 - a) is equal to
the probability of an independent occurrence, 1 - p(1 - a) = q + ap is the
probability of no independent occurrence on any one of the 23 observations
following the first hour. Therefore, the probability of having no occurrence
at all on any one day is given by:

(91)  $q_t = q (q + ap)^{23}$

Substitution of Equation (90) into Equation (91) gives:

$q_t = q \left[ q + \left( 1 - \frac{1}{D} \right) p \right]^{23} = q \left( q + p - \frac{p}{D} \right)^{23}$

and since p + q = 1

(92)  $q_t = q \left( 1 - \frac{p}{D} \right)^{23}$

If Equation (92) is substituted into Equation (89), the resulting equation is

(93)  $M = N \left[ 1 - q \left( 1 - \frac{p}{D} \right)^{23} \right]$

Equation (93) may be used to find the mean number of days per month when the
percentage frequency (all hours) is known and an estimate of the mean duration
can be made. Figure 14 is the graphical solution of Equation (93); the mean
number of days can be determined readily from it.
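
Equations (88) and (93) are simple enough to evaluate directly rather than reading Figure 14. The following is a minimal sketch; the function names and the `obs_per_day` parameter (24 hourly observations, giving the exponent 23) are illustrative assumptions, not part of the original text.

```python
def days_at_specific_hour(n_days, pct_freq):
    """Equation (88): M = N * p for a single observation hour."""
    return n_days * pct_freq / 100.0


def days_at_any_hour(n_days, pct_freq_all_hours, mean_duration_hours, obs_per_day=24):
    """Equation (93): M = N * [1 - q * (1 - p/D)**(obs_per_day - 1)]."""
    p = pct_freq_all_hours / 100.0
    d = mean_duration_hours
    if not 0.0 <= p <= 1.0:
        raise ValueError("percentage frequency must be between 0 and 100")
    if d <= 0.0 or p / d > 1.0:
        raise ValueError("mean duration must be positive and at least p hours")
    q = 1.0 - p
    q_t = q * (1.0 - p / d) ** (obs_per_day - 1)  # probability of a day with no occurrence
    return n_days * (1.0 - q_t)


if __name__ == "__main__":
    # Example 42: 3.7% at a specific hour in a 30-day month.
    print(round(days_at_specific_hour(30, 3.7), 1))
    # A 4% all-hours frequency with D = 1.6 hours in a 31-day month.
    print(round(days_at_any_hour(31, 4.0, 1.6), 1))
```

For a 4% all-hours frequency, D = 1.6 hours, and a 31-day month, the formula gives about 14.4 days. Setting D = 1 hour makes a = 0, so every hourly observation is treated as independent.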

d. The new USAF Revised Uniform Summary of Surface Weather Observations
gives the percentage frequency (all hours) of various weather phenomena and
the percentage of days with the same phenomena. These data were used with
Equation (93) to derive the mean duration of thunderstorms, rain and/or
drizzle, snow and/or sleet, and fog at four USAF bases. Observed mean
durations of these phenomena for the four airbases were calculated from actual
weather observations and compared with the derived values. Table 23 shows
agreement to be quite good. Percentage frequencies (all hours) and percentages
of days
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TABLE  23 

Derived and Observed Mean Duration of Weather Phenomena.

                   Thunderstorms           Snow and/or Sleet
                 January and July               January
  Station       Derived   Observed        Derived   Observed
  Bergstrom       1.7       1.7             5.6
  Carswell        1.5       2.0             3.8
  Lockbourne      1.8       1.6             5.9
  McGuire         1.6       1.7             3.9

                           Rain and/or Drizzle
                    January                      July
  Station       Derived   Observed        Derived   Observed
  Bergstrom       6.3       4.2             1.9       1.9
  Carswell        4.4       3.4             2.5       1.9
  Lockbourne      6.7       4.2             2.2       2.0
  McGuire         6.2       4.4             3.0       2.6

                                  Fog
                    January                      July
  Station       Derived   Observed        Derived   Observed
  Bergstrom      10.5       8.1             3.1       1.9
  Carswell        9.8       8.9             5.2       3.0
  Lockbourne      9.2       9.0             4.0       4.1
  McGuire         9.8       8.3             5.2       6.1

of these phenomena were taken from the Revised Uniform Summary of Surface
Weather Observations for 63 USAF bases and used with Equation (93) to derive
mean durations of weather phenomena at these airbases. Next, averages and
standard deviations were calculated; they are given in Table 24. These
average values should give good estimates of D, the mean duration, for
thunderstorms, rain and/or drizzle, snow and/or sleet, and fog.
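
Deriving a mean duration from Equation (93), as described in paragraph d, amounts to solving that equation for D. The following is a minimal sketch of the inversion; the function names are hypothetical, and the exponent 23 assumes 24 hourly observations per day.

```python
def fraction_of_days(p, duration, obs_per_day=24):
    """Equation (93) divided by N: fraction of days with the phenomenon."""
    q = 1.0 - p
    return 1.0 - q * (1.0 - p / duration) ** (obs_per_day - 1)


def derive_mean_duration(p, frac_days, obs_per_day=24):
    """Solve Equation (93) for D given the all-hours probability p and the
    fraction of days with the phenomenon (both as fractions, not percent)."""
    if not 0.0 < p < frac_days < 1.0:
        raise ValueError("need 0 < p < frac_days < 1 for a finite positive duration")
    q = 1.0 - p
    ratio = (1.0 - frac_days) / q              # equals (1 - p/D)**(obs_per_day - 1)
    root = ratio ** (1.0 / (obs_per_day - 1))  # equals 1 - p/D
    return p / (1.0 - root)


if __name__ == "__main__":
    # Round trip: a 4% all-hours frequency with D = 1.6 hours.
    frac = fraction_of_days(0.04, 1.6)
    print(round(frac, 4), round(derive_mean_duration(0.04, frac), 3))
```

The requirement that the fraction of days exceed p reflects the fact that a day counts as a day with the phenomenon if it occurs on any observation; equality would correspond to an unbounded duration.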

Example 43: How many days with thunderstorms can be expected at Miami,
Florida, during the month of August if the percentage frequency (all
hours) is 4%? From Table 24, the mean duration, D, of thunderstorms
is estimated to be 1.6 hours. Consequently, for a percentage
frequency of 4% and D of 1.6 hours, Figure 14 gives 15 days with
thunderstorms, which is very close to the observed value of 16 days at
Miami.

TABLE 24

Mean Duration of Phenomena in Hours at USAF Bases.

  Thunderstorms
  Middle Latitude and Tropics              Jan    Apr    Jul    Oct
  Average Mean Duration                    1.7    1.6    1.6    1.5
  Standard Deviation                       1.2    0.6    0.4    0.5
  No. of Bases Reporting                    22     45     49     50

  Rain and/or Drizzle
  Arctic and Middle Latitude               Jan    Apr    Jul    Oct
  Average Mean Duration                    5.0    4.4    2.7    4.2
  Standard Deviation                       1.3    2.3    1.2    1.1
  No. of Bases Reporting                    55     55     54     55

  Rain and/or Drizzle
  Tropics                                  Jan    Apr    Jul    Oct
  Average Mean Duration                    1.5    1.3    1.3    1.3
  Standard Deviation                       0.6    0.5    0.3    0.3
  No. of Bases Reporting                     7      7      7      7

  Snow and/or Sleet
  Arctic and Middle Latitude               Jan    Apr    Jul    Oct
  Average Mean Duration                    4.8    3.9     --    3.7
  Standard Deviation                       1.4    1.9     --    1.9
  No. of Bases Reporting                    46     29     --     18

  Fog
  Arctic, Middle Latitude and Tropics      Jan    Apr    Jul    Oct
  Average Mean Duration                    8.3    5.3    4.1    5.4
  Standard Deviation                       2.8    2.0    1.5    2.1
  No. of Bases Reporting                    59     56     53     57
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questions  concerning  engineering  design  require  knowledge  of  the  extreme  value 
to be expected in a stated number of years. Buildings and antennae must be
able to withstand the strongest wind, roofs must withstand the greatest snow
load, and dams must withstand the maximum flood anticipated in the lifetime of
the structure. In such problems, specified calculated risks may be taken when
the likelihood of occurrence of these extremes can be estimated. The
statistical theory of extreme values is the method of estimating the extreme
values to be expected in a given period, such as the lifetime of the
structure. Different investigators advocate using slightly different
theoretical distributions to compute extremes from observed data. The most
frequently used theoretical distribution for extreme values is the
double-exponential distribution. Gringorten [26] covers the task of selecting
an ideal distribution for extreme values. Court [17] demonstrates that the
distribution of 5-minute annual extreme winds is approximated exceptionally
well by the double-exponential distribution. Gumbel [27] presents an excellent
and exhaustive discussion of extreme value theory. Thom [46] discusses
applications of the theory to meteorological extremes.

b. The theory applies to the largest (or smallest) values in each of N
independent sets of n independent observations drawn from the same population.
This parent population must be distributed according to some exponential law
so that it is unlimited but tends to zero as the variable increases or
decreases. The distribution must also possess all moments. The fundamental
theorem of the theory of extreme values is: In a set of N independent extremes
$x_1, x_2, x_3, \ldots, x_N$, each being the extreme of n observations of an
unlimited, exponentially-distributed variable, as both N and n grow large, the
cumulative probability that any of these N extremes will be less than any
chosen quantity, x, approaches the double-exponential expression

(94)  $F(x) = \exp\left[ -e^{-a (x - \tilde{x})} \right]$

where: x = some value of the variable, and $\tilde{x}$ = most frequent value
(mode) of the set of extremes.

c. The two values a and $\tilde{x}$ are estimated by the theory of least
squares from the data of the sample, using two theoretical quantities:

(95)  $\frac{1}{a} = \frac{s_x}{\sigma_N}$

(96)  $\tilde{x} = \bar{x} - \frac{s_x}{\sigma_N} \bar{Y}_N$

The mean is $\bar{x}$ and the standard deviation of the set of extremes
(sample) is $s_x$, while the mean $\bar{Y}_N$ and standard deviation
$\sigma_N$ of the theoretical variate depend only on the sample size N. Since
the double-exponential form of the basic Equation (94) imposes difficulties in
computation and analysis, it is reduced to linear form by taking the double
logarithm of both sides. The new variate, $y(x) = -\ln[-\ln F(x)]$, is called
the reduced variate:

$y = a (x - \tilde{x})$

Solved for x, this equation becomes:

$x = \tilde{x} + \frac{y}{a}$

After substitution of Equations (95) and (96), this expression becomes

(97)  $x = \bar{x} + \frac{s_x}{\sigma_N} (y - \bar{Y}_N)$

This equation gives the expected extreme for any set of N extremes for a
specified probability of nonoccurrence given by y. Equation (97) is the basic
equation for computing various expected extremes and gives the "frequency
factor" for the theory of extreme values:

(98)  $K = \frac{y - \bar{Y}_N}{\sigma_N}$

Since the reduced variate y is the double logarithm of the probability, and
$\bar{Y}_N$ and $\sigma_N$ depend only on the sample size, K can be tabulated
for use. Table 25 presents values of K for various probabilities, F(x), and
various sample sizes, N. If additional values are required for a specific
problem, they can be easily computed from Equation (98) and the reduced
variate equation. The "general formula" for the "line of expected extremes" is

(99)  $x = \bar{x} + K s_x$

where x is the expected extreme, whose probability of not being equaled is
$F(x) = \exp(-e^{-y})$, and $\bar{x}$ and $s_x$ are the mean and standard
deviation from the available sample.
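
Additional values of K can be computed from Equation (98) and the reduced variate. The following is a minimal sketch; the function names are illustrative, and the theoretical mean and standard deviation for the sample size are assumed to be supplied by the user from Table 25 rather than computed.

```python
import math


def reduced_variate(prob_nonoccurrence):
    """y = -ln[-ln F(x)] for a cumulative probability 0 < F(x) < 1."""
    if not 0.0 < prob_nonoccurrence < 1.0:
        raise ValueError("F(x) must lie strictly between 0 and 1")
    return -math.log(-math.log(prob_nonoccurrence))


def frequency_factor(prob_nonoccurrence, y_n, sigma_n):
    """Equation (98): K = (y - Y_N) / sigma_N, with Y_N and sigma_N from Table 25."""
    return (reduced_variate(prob_nonoccurrence) - y_n) / sigma_n


def expected_extreme(mean, std_dev, k):
    """Equation (99): x = mean + K * s_x."""
    return mean + k * std_dev


if __name__ == "__main__":
    # Table 25 constants for N = 23.
    y_23, sigma_23 = 0.5283, 1.0811
    for f in (0.999, 0.99, 0.975, 0.95, 0.80):
        print(f, round(frequency_factor(f, y_23, sigma_23), 3))
```

The printed values can be compared with the N = 23 row of Table 25.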

d. The term "return period" is used frequently in extreme value analysis.
By definition:

An event that happens A times in N trials has a relative
frequency of occurrence of A/N and a return period of RP = N/A.

The return period, or reciprocal of the relative frequency, is therefore the
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i TABLE  25 

Values of K for Various Probabilities and Sample Sizes.

  Probability F(x):                      .999      .99       .975      .95       .80
  Return Period:                         1000      100       40        20        5
  y = -ln[-ln F(x)]:                     6.90726   4.60015   3.67625   2.97020   1.49994

     N     $\bar{Y}_N$   $\sigma_N$                     Values of K
     15      .5128        1.0206         6.265     4.005     3.100     2.408     0.967
     16      .5157        1.0316         6.196     3.959     3.064     2.379     0.954
     17      .5181        1.0411         6.137     3.921     3.033     2.355     0.943
     18      .5202        1.0493         6.087     3.888     3.008     2.335     0.934
     19      .5220        1.0566         6.043     3.860     2.985     2.317     0.926
     20      .5236        1.0628         6.006     3.836     2.966     2.302     0.919
     21      .5252        1.0696         5.967     3.810     2.946     2.286     0.911
     22      .5268        1.0754         5.933     3.788     2.929     2.272     0.905
     23      .5283        1.0811         5.900     3.766     2.912     2.259     0.899
     24      .5296        1.0864         5.870     3.747     2.896     2.247     0.893
     25      .5309        1.0914         5.842     3.728     2.882     2.235     0.888
     26      .5320        1.0961         5.816     3.711     2.869     2.224     0.883
     27      .5332        1.1004         5.792     3.696     2.856     2.215     0.879
     28      .5343        1.1047         5.769     3.681     2.844     2.205     0.874
     29      .5353        1.1086         5.748     3.667     2.833     2.196     0.870
     30      .5362        1.1124         5.727     3.653     2.823     2.188     0.866
     40      .5436        1.1413         5.576     3.554     2.745     2.126     0.838
     50      .5485        1.1607         5.478     3.491     2.695     2.086     0.820
     60      .5521        1.1747         5.410     3.446     2.660     2.058     0.807
     70      .5548        1.1854         5.359     3.413     2.633     2.038     0.797
     80      .5569        1.1938         5.319     3.387     2.613     2.022     0.790
     90      .5586        1.2007         5.287     3.366     2.597     2.009     0.784
    100      .5600        1.2065         5.261     3.349     2.583     1.998     0.779
   1000      .5745        1.2685         4.992     3.174     2.445     1.889     0.730
     ∞       .5772        1.2826         4.936     3.137     2.416     1.866     0.719

average interval between recurrences of the event in a particular series of
trials. Expressed in terms of probability, the return period is

(100)  $RP = \frac{1}{1 - F(x)}$

where F(x) has been previously defined (page 4-53) and 1 - F(x) is the
relative frequency, f(x), of the event. The return period is often
misunderstood and should not be used except as a tool in computing calculated
risks. Calculated risk is defined as the probability of at least one
occurrence of an event during a specified time interval. Given the variable x,
a calculated risk of some value of x can be stated for one year, five years,
ten years, or longer. Consider the binomial theorem (page 3-16, Chapter 3). If
n is any positive integer and p and q are any numbers, then

$(p + q)^n = \sum_{r=0}^{n} \binom{n}{r} p^{n-r} q^{r}$

If p is the probability of some event and q its complement, and the only
possible events are p and q, then the probability that p will occur in all n
trials is the r = 0 term:

$\binom{n}{0} p^{n-0} q^{0} = p^n$

It was seen earlier that F(x) is the probability that the event x will not
occur. The probability that x will not occur in n trials is $[F(x)]^n$. Hence,
the probability of at least one occurrence (calculated risk) of the event x is

(101)  $f(x)_n = 1 - [F(x)]^n$

To express this in terms of the return period, it is only necessary to
rearrange Equation (100) to give

(102)  $F(x) = 1 - \frac{1}{RP}$

Substitution into Equation (101) yields

(103)  $f(x)_n = 1 - \left( 1 - \frac{1}{RP} \right)^n$

If $f(x)_n$ is fixed in advance, then RP may be found by conversion of
Equation (103) to

(104)  $RP = \frac{1}{1 - [1 - f(x)_n]^{1/n}}$

Solution of Equation (104) gives necessary return periods for various values
of n and $f(x)_n$. Table 26 gives these values for several combinations of
$f(x)_n$ and n.
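
Equations (103) and (104) can be evaluated directly for combinations not tabulated. The following is a minimal sketch with illustrative function names; it assumes, as the derivation does, that the n trials (years) are independent.

```python
def calculated_risk(return_period, n_years):
    """Equation (103): probability of at least one occurrence in n years."""
    if return_period <= 1.0:
        raise ValueError("return period must exceed 1")
    return 1.0 - (1.0 - 1.0 / return_period) ** n_years


def required_return_period(risk, n_years):
    """Equation (104): return period giving the stated risk over n years."""
    if not 0.0 < risk < 1.0:
        raise ValueError("risk must lie strictly between 0 and 1")
    return 1.0 / (1.0 - (1.0 - risk) ** (1.0 / n_years))


if __name__ == "__main__":
    # A 10% calculated risk over a 10-year design life.
    print(round(required_return_period(0.10, 10)))
    # Risk of meeting the 100-year value at least once in 10 years.
    print(round(calculated_risk(100.0, 10), 3))
```

For a 10% risk over 10 years the required return period rounds to 95 years, the value used in the design-wind example that follows.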

e. Figure 15 illustrates the special "extreme probability paper" prepared
by Gumbel [27] and modified by Court [17], in which the cumulative
probability, F(x), is plotted on the left ordinate and the variate, x, on the
abscissa. The right ordinate is a quasi-logarithmic scale for the return
period. On this paper a double-exponential distribution appears as a straight
line. Extreme probability paper is identical in function and use to other
probability papers; observations are plotted on it by rank and magnitude.
Each extreme is plotted at an abscissa corresponding to its value and at an
ordinate, on the double logarithmic scale, corresponding to its cumulative
rank divided by N + 1. The ordinate for the 12th largest extreme of 24
extreme values is 13/25 or 0.52. These points should approximate a straight
line if the set of extremes follows the theory of extreme values.

Example 44: A structure is to be built at Argentia NAS, Newfoundland.
The structure is planned for a useful life of 10 years. A calculated
risk of 10% that the design wind speed will be equaled, or exceeded,
during the 10 years (design life) is specified. What is the wind speed
that meets these criteria?

Solution: From climatic data shown in Table 27, the following annual
maximum winds are available for Argentia NAS (values are for 30 feet
above the ground):

TABLE 27

Peak Gust Wind Data for Argentia NAS.

            Peak Gust              Peak Gust              Peak Gust
  Year        (kts)      Year        (kts)      Year        (kts)
  1941         73        1949         72        1957         78
  1942         76        1950         77        1958         74
  1943         73        1951         79        1959         90
  1944         72        1952         80        1960         87
  1945         71        1953         74        1961         69
  1946         77        1954         77        1962         67
  1947         62        1955         91        1963         72
  1948         75        1956         83         --          --

Compute the mean, $\bar{x}$, and the standard deviation, $s_x$, of the
above peak gust observations (N = 23):

$\bar{x} = \frac{\sum x}{N} = \frac{1749}{23} = 76.04$ knots

$s_x^2 = \frac{\sum x^2 - N \bar{x}^2}{N - 1} = \frac{134{,}049 - 132{,}994}{22} = 47.95$

or

$s_x = 6.92$ knots

The estimate of the extreme value, x, for a given probability of being
equaled or exceeded during any one year is given by $x = \bar{x} + s_x K$,
where $\bar{x}$ and $s_x$ are the mean and standard deviation of the
available sample.

K is the frequency factor; it varies with the probability level and
sample size. From Table 25, select at least three probability levels
to compute the "line of expected extremes."

  Probability Level      K (N = 23)
        .999               5.900
        .99                3.766
        .95                2.259
        .80                0.899

From the above equation

$x_{.999} = \bar{x} + s_x (5.900) = 76.04 + 6.92 (5.900) = 116.9$ knots
$x_{.99} = 76.04 + 6.92 (3.766) = 102.1$ knots
$x_{.95} = 76.04 + 6.92 (2.259) = 91.7$ knots
$x_{.80} = 76.04 + 6.92 (0.899) = 82.3$ knots

Figure 15 is the plot of the "line of expected extremes." The straight
line on the extreme probability paper in Figure 16 gives the calculated
distribution of annual extremes of the peak gust wind speed. From this
line, read the calculated risk for any wind speed during any one year.
Determine the wind speed for a calculated risk of 10% during 10 years.

The right ordinate of the extreme probability paper is expressed in
return period. Notice that the 1% risk (.99 probability of
nonoccurrence) value is equivalent to a 100-year return period. From
Table 26, which gives return periods for various calculated risks
during a specified number of years, select the necessary return period
for a 10% risk during 10 years. This value is 95 years. The wind speed
value corresponding to a 95-year return period is 102 knots. This value
(102 knots) has a calculated risk of 10% of being equaled, or exceeded,
at least once during any 10-year period. This is the required design
wind. The assumption was made that the annual peak wind gusts are
double-exponentially distributed. To determine how effectively the
annual peak gusts fit the assumed distribution, plot the observed
values using the Gumbel plotting rule; that is, the observations are
arranged in ascending order and each cumulative rank (lowest to
highest) is divided by N + 1 to give the cumulative probability of
nonoccurrence. These points are also shown in Figure 16. The plot of
these data on extreme probability paper (Figure 16) indicates that
these data approximate a straight line, the fit is excellent, and the
assumption of a double-exponential distribution is adequate.
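
The design-wind calculation in Example 44 can also be carried out numerically instead of graphically. The sketch below is illustrative only: it uses the 23 peak gusts of Table 27 and the Table 25 constants for N = 23, and the function and variable names are assumptions introduced here. It replaces the graphical reading of the 95-year value with a direct evaluation of the reduced variate.

```python
import math
import statistics

# Annual peak gusts (knots) for 1941-1963, from Table 27.
PEAK_GUSTS_KT = [73, 76, 73, 72, 71, 77, 62, 75,
                 72, 77, 79, 80, 74, 77, 91, 83,
                 78, 74, 90, 87, 69, 67, 72]

# Theoretical mean and standard deviation of the reduced variate
# for N = 23, taken from Table 25.
Y_N, SIGMA_N = 0.5283, 1.0811


def required_return_period(risk, n_years):
    """Return period giving the stated risk over n years (Equation 104)."""
    return 1.0 / (1.0 - (1.0 - risk) ** (1.0 / n_years))


def design_extreme(sample, y_n, sigma_n, risk, n_years):
    """Expected extreme with the stated calculated risk over n years."""
    mean = statistics.mean(sample)
    s_x = statistics.stdev(sample)                # sample std dev, divisor N - 1
    rp = required_return_period(risk, n_years)
    f_x = 1.0 - 1.0 / rp                          # annual probability of nonoccurrence
    y = -math.log(-math.log(f_x))                 # reduced variate
    k = (y - y_n) / sigma_n                       # frequency factor, Equation (98)
    return mean + k * s_x                         # Equation (99)


if __name__ == "__main__":
    wind = design_extreme(PEAK_GUSTS_KT, Y_N, SIGMA_N, risk=0.10, n_years=10)
    print(round(wind), "knots")
```

The result rounds to the 102 knots read from the probability paper. The standard deviation computed here differs slightly in the second decimal from the hand-calculated 6.92 knots, which does not change the rounded design wind.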


November  1968 


AWSP  105-2 


Chapter  5 

WIND  DISTRIBUTIONS 


1. Introduction.

One of the main problems of the applied climatologist has been, and
continues to be, the determination of wind distribution (speed and direction)
through a given layer or at a given pressure surface. To provide adequate
service in this area, the climatologist must be familiar with certain standard
problem-solving techniques. This chapter deals with the statistical model of
the wind distribution at a given level or pressure surface; this model is
known generally as the elliptical normal distribution. A special case of this
distribution is referred to as the circular normal distribution. One facet of
the wind distribution problem is the ballistic wind, defined as a fictitious,
uniform wind extending from the ground to bombing altitude, which is
determined in such a way that its effect on the bomb is the same as that of
the variable winds actually encountered. The vector correlations between winds
at two levels are the links between the distribution of the ballistic wind and
the distribution of wind at a number of levels in the upper air.


2. Circular Normal Wind Distribution.

a. The frequency distribution known as the circular normal distribution
has been used frequently to represent the climatological distribution of wind
vectors in the upper air over a given location or area. It requires that wind
observations form a homogeneous set or, in other words, are drawn from the
same population [11]. Figure 17 illustrates the variables that are involved in
the development of the circular normal distribution. $V_R$ is the mean
resultant vector of the sample, V is any wind observation in the sample, v is
the vector difference between $V_R$ and V, θ is the angle between v and $V_R$,
and x and y are the components of v along and at right angles to $V_R$,
respectively.

Figure 17. Variables in Circular Normal Distribution.
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b.  The  circular  normal  distribution  requires  that  the  following  three 
conditions exist: (1) x and y are normally distributed, (2) the variances of
x and y are equal, and (3) x and y are statistically independent. These
conditions determine the form of the joint frequency distribution of x and y.
It is the product of two normal distributions.


(105)  $$P(x,y) = \frac{1}{2\pi\sigma_x\sigma_y}\int_{-x}^{x}\int_{-y}^{y}\exp\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2}+\frac{y^2}{\sigma_y^2}\right)\right]dx\,dy$$

This equation can be transformed into a polar coordinate form with v and
$\theta$ as the variables. From the stipulated conditions and Figure 17, the
following relations hold:

$$v^2 = x^2 + y^2$$

$$x = v\cos\theta, \qquad y = v\sin\theta$$

and

$$dx\,dy = |J|\,dv\,d\theta = \begin{vmatrix}\cos\theta & \sin\theta\\ -v\sin\theta & v\cos\theta\end{vmatrix}\,dv\,d\theta = v\,dv\,d\theta$$

Therefore, the joint distribution of v and $\theta$ is

$$P(v,\theta) = \frac{1}{\pi\sigma_v^2}\int_{0}^{v}\int_{0}^{\theta} v\,e^{-v^2/\sigma_v^2}\,dv\,d\theta$$

where $\sigma_v$ is the vector standard deviation (with
$\sigma_x = \sigma_y$, $\sigma_v^2 = \sigma_x^2 + \sigma_y^2 = 2\sigma_x^2$).
If we integrate this over $\theta$ from 0 to $2\pi$, the distribution of v is

(106)  $$P(v) = \frac{2}{\sigma_v^2}\int_{0}^{v} v\,e^{-v^2/\sigma_v^2}\,dv = 1 - e^{-v^2/\sigma_v^2}$$

Letting $v = k\sigma_v$,

$$P(k\sigma_v) = 1 - e^{-k^2}$$

This distribution can be represented as a family of concentric circles
centered at the end of the vector $\vec{V}_R$ and with the radius
$k\sigma_v$ (Figure 18). There is a frequency for each circle; it is the
frequency that the ends of the wind vectors will be within the circle of
radius $k\sigma_v$. The percent probabilities for various values of the
multiplier k, where $k = [-\ln(1 - P)]^{1/2}$, are:

   k     =  .32   .47   .60   .71   .83   .96   1.10   1.27   1.52   2.15
   P(%)  =  10    20    30    40    50    60    70     80     90     99

Figure 18.  Circular Normal Distribution Probabilities.
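The relation between the multiplier k and the enclosed probability can be checked numerically. The following is a minimal Python sketch, not part of the original report; the function names `circular_k` and `circular_probability` are illustrative assumptions. It evaluates $k = [-\ln(1 - P)]^{1/2}$ and the inverse relation $P = 1 - e^{-(v/\sigma_v)^2}$.

```python
import math


def circular_k(p):
    """Multiplier k such that a circle of radius k * sigma_v, centered on the
    end of the mean resultant vector, contains the fraction p of vector ends."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return math.sqrt(-math.log(1.0 - p))


def circular_probability(radius, sigma_v):
    """Fraction of wind-vector ends lying within `radius` of the end of the
    mean resultant vector, for vector standard deviation sigma_v."""
    return 1.0 - math.exp(-((radius / sigma_v) ** 2))


# Print the multiplier for each tabulated percent probability.
for percent in (10, 20, 30, 40, 50, 60, 70, 80, 90, 99):
    print(percent, round(circular_k(percent / 100.0), 2))
```

The two functions are inverses of each other, so a radius of `circular_k(p) * sigma_v` returns `p` from `circular_probability`.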


3.  Test  for  Circularity. 

a.  In order for the analyst to know when to accept a distribution as
circular, he needs a method to test the significance of the ellipticity of the
distribution. Mauchly [36] defines an ellipticity statistic, $L_e$, by:

(107)  $$L_e = \frac{2\,\sigma_x\sigma_y\sqrt{1 - r_{xy}^2}}{\sigma_x^2 + \sigma_y^2}$$

In Equation (107), $\sigma_x$ and $\sigma_y$ are the standard deviations of
the distribution of the individual zonal and meridional components,
respectively, and $r_{xy}$ is the correlation coefficient of these zonal and
meridional components. For a perfect circular distribution,
$\sigma_x = \sigma_y$, $r_{xy} = 0$, and $L_e = 1$; otherwise, $L_e$ is less
than 1.

b.  The probability of obtaining a value of $L_e$ as small as the value
found by using Equation (107) in a sample of N independent observations drawn
from a population in which $L_e = 1$ is shown by Mauchly to be $L_e^{\,N-2}$.
If the 5% level of significance is adopted, Brooks and Carruthers [11] state
that the analyst should accept the distribution as circular, provided that

$$L_e^{\,N-2} \ge 0.05$$
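The circularity test can be sketched in a few lines of Python. This sketch is not from the original report. It assumes that the probability attributed to Mauchly is $L_e^{\,N-2}$ and that the distribution is accepted as circular when this probability is at least the chosen significance level; both are assumptions supplied here because the expressions are not legible in the text. The function names are illustrative.

```python
import math


def ellipticity(sigma_x, sigma_y, r_xy):
    """Ellipticity statistic L_e of Equation (107)."""
    return (2.0 * sigma_x * sigma_y * math.sqrt(1.0 - r_xy ** 2)
            / (sigma_x ** 2 + sigma_y ** 2))


def accept_as_circular(sigma_x, sigma_y, r_xy, n_obs, level=0.05):
    """Return (accept, L_e, probability).

    ASSUMPTION: the probability of an L_e this small in a sample of n_obs
    observations from a circular population is taken as L_e ** (n_obs - 2),
    and circularity is accepted when that probability is >= level.
    """
    l_e = ellipticity(sigma_x, sigma_y, r_xy)
    probability = l_e ** (n_obs - 2)
    return probability >= level, l_e, probability


# Hypothetical component statistics (same units for both components).
print(accept_as_circular(21.0, 19.5, 0.05, 60))
print(accept_as_circular(20.0, 10.0, 0.00, 50))
```

Because the exponent grows with N, even a small departure of $L_e$ from 1 becomes significant in a large sample.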
4.  Elliptical Normal Wind Distribution.

a.  The circular normal distribution model has proved extremely useful in
a variety of applications as an approximation of the true distribution.
However, many of the actual distributions of the upper winds do not satisfy
the criteria of equal variance for the components of $\vec{v}$ or that of
statistical independence of the components. If the distribution does not meet
the criteria for circularity, it is possible to represent the upper winds
with the basic noncircular (elliptical) distribution (Figure 19). This is the
general form of the bivariate normal distribution; it requires three
parameters (the variances of the two components and the correlation
coefficient) in place of the standard vector deviation or its square (vector
variance). This joint distribution of the x- and y-components is:

(108)  $$P(x,y) = \frac{1}{2\pi\sigma_x\sigma_y(1-r^2)^{1/2}}\int_{-x}^{x}\int_{-y}^{y}\exp\left\{-\frac{1}{2(1-r^2)}\left[\frac{x^2}{\sigma_x^2}-\frac{2rxy}{\sigma_x\sigma_y}+\frac{y^2}{\sigma_y^2}\right]\right\}dx\,dy$$

Figure 19.  Elliptical Normal Distribution Probabilities.

For this model of the upper wind, the projection of the frequency surface
onto the xy-plane will be represented by a family of ellipses.

b.  The application of this elliptical normal distribution to problems
involving wind vectors is very difficult when the distribution is expressed in
the form of Equation (108). However, difficulties can be avoided by changing
the form of the equation to one that permits the analyst to compute readily
the probability of a wind vector being in a given ellipse.

c.  To develop the equation for the probability ellipse, let x and y be
the components of the departure from the mean resultant vector ($\vec{V}_R$)
in the conventional axes, east and north, respectively, and let x' and y' be
the corresponding departures referred to the axes rotated counterclockwise
through $\theta$. The origin is at the extremity of the mean resultant wind,
$x' = x\cos\theta + y\sin\theta$, and $y' = y\cos\theta - x\sin\theta$.

(1)  From the preceding equations, Scott [43] deduces that

(109)  $$\sigma_{x'}^2 = \sigma_x^2\cos^2\theta + \sigma_y^2\sin^2\theta + 2r\sigma_x\sigma_y\sin\theta\cos\theta$$

       $$\sigma_{y'}^2 = \sigma_y^2\cos^2\theta + \sigma_x^2\sin^2\theta - 2r\sigma_x\sigma_y\sin\theta\cos\theta$$

       $$r'\sigma_{x'}\sigma_{y'} = r\sigma_x\sigma_y\cos 2\theta - \tfrac{1}{2}\left(\sigma_x^2 - \sigma_y^2\right)\sin 2\theta$$

where r and r' are the correlation coefficients of the non-primed and primed
components, respectively. The $\sigma_x$, $\sigma_y$, and r are reported in
standard upper-wind summaries. With these values known, it is possible to
select an angle so that r' = 0 and solve for $\sigma_{x'}$ and
$\sigma_{y'}$. Letting r' equal zero,

$$0 = 2r\sigma_x\sigma_y\cos 2\theta - \left(\sigma_x^2 - \sigma_y^2\right)\sin 2\theta$$

or

$$\tan 2\theta = \frac{2r\sigma_x\sigma_y}{\sigma_x^2 - \sigma_y^2}$$

and

(110)  $$\theta = \tfrac{1}{2}\arctan\left[\frac{2r\sigma_x\sigma_y}{\sigma_x^2 - \sigma_y^2}\right]$$

Equation (110) gives the rotation angle $\theta$ for substitution into
Equation (109) to determine $\sigma_{x'}$ and $\sigma_{y'}$, the standard
deviations of wind components along the major and minor axes of the
distribution, respectively. These two standard deviations ($\sigma_{x'}$ and
$\sigma_{y'}$) are given as $\sigma_a$ and $\sigma_b$ in the standard
upper-wind summaries and henceforth in this chapter, $\sigma_{x'}$ becomes
$\sigma_a$ and $\sigma_{y'}$ becomes $\sigma_b$.
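The rotation to the principal axes can be illustrated with a short Python sketch, which is not part of the original report. The function name `principal_axes` is an assumption, and the two-argument arctangent is used here in place of the plain arctangent of Equation (110) so that the returned angle always refers to the major axis, including the case of equal component variances.

```python
import math


def principal_axes(sigma_x, sigma_y, r):
    """Return (theta, sigma_a, sigma_b): the counterclockwise rotation angle
    in radians and the standard deviations along the major and minor axes."""
    cov = r * sigma_x * sigma_y
    # Equation (110), written with atan2 to select the major-axis branch.
    theta = 0.5 * math.atan2(2.0 * cov, sigma_x ** 2 - sigma_y ** 2)
    c, s = math.cos(theta), math.sin(theta)
    # Equation (109) for the variances along the rotated axes.
    var_a = sigma_x ** 2 * c * c + sigma_y ** 2 * s * s + 2.0 * cov * s * c
    var_b = sigma_y ** 2 * c * c + sigma_x ** 2 * s * s - 2.0 * cov * s * c
    # Guard against tiny negative values caused by rounding.
    return theta, math.sqrt(max(var_a, 0.0)), math.sqrt(max(var_b, 0.0))


# Hypothetical summary statistics for one level.
theta, sigma_a, sigma_b = principal_axes(12.0, 8.0, 0.5)
print(math.degrees(theta), sigma_a, sigma_b)
```

Two checks follow from the rotation: the sum of the two variances is unchanged, and their product equals $\sigma_x^2\sigma_y^2(1 - r^2)$.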


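(2)  The following method provides an easier means for computing
$\sigma_a$ and $\sigma_b$:

$$\begin{vmatrix}\sigma_x^2 - \sigma^2 & r\sigma_x\sigma_y\\ r\sigma_x\sigma_y & \sigma_y^2 - \sigma^2\end{vmatrix} = 0$$

The expansion of this determinant is a quadratic equation:

$$\sigma^4 - \left(\sigma_x^2 + \sigma_y^2\right)\sigma^2 + \sigma_x^2\sigma_y^2\left(1 - r^2\right) = 0$$

Of the two roots, $\sigma_a^2$ and $\sigma_b^2$, the larger value is the
variance of the wind components along the major axis of the distribution and
the smaller value is the variance of the wind components along the minor
axis. Therefore, the positive square roots of $\sigma_a^2$ and $\sigma_b^2$
are the desired standard deviations, $\sigma_a$ and $\sigma_b$. With r' = 0,
the x' and y' components can be treated as independent, and the wind
distribution can be expressed as the joint distribution of x' and y'.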


$$P(x',y') = \frac{1}{2\pi\sigma_a\sigma_b}\int_{-a}^{a}\int_{-b}^{b}\exp\left[-\frac{1}{2}\left(\frac{x'^2}{\sigma_a^2}+\frac{y'^2}{\sigma_b^2}\right)\right]dx'\,dy'$$

where $x'^2/a^2 + y'^2/b^2 = 1$ and $a/\sigma_a = b/\sigma_b$. Let
$u = x'/\sigma_a$ and $v = y'/\sigma_b$; then

$$P(u,v) = \frac{1}{2\pi}\int_{-a/\sigma_a}^{a/\sigma_a}\int_{-b/\sigma_b}^{b/\sigma_b} e^{-(u^2+v^2)/2}\,du\,dv$$

Now transforming to polar coordinates, let $z^2 = u^2 + v^2$,
$u = z\cos\theta$, and $v = z\sin\theta$. Then

$$P(z,\theta) = \frac{1}{2\pi}\int_{0}^{2\pi}\int_{0}^{a/\sigma_a} z\,e^{-z^2/2}\,dz\,d\theta$$

Integration of the above equation, when

$$P\left(\frac{x'^2}{a^2}+\frac{y'^2}{b^2}=1\right)$$

is the probability that the wind vector ends within the ellipse defined by the
parenthetical equation, gives:

(112)  $$P\left(\frac{x'^2}{a^2}+\frac{y'^2}{b^2}=1\right) = 1 - e^{-a^2/2\sigma_a^2}$$

Therefore, the major axis of the ellipse, a, is expressed in multiples of
$\sigma_a$: $a = k_e\sigma_a$. The percent probabilities for various values of
the multiplier, $k_e$, where $k_e = \{2\ln[1/(1 - P)]\}^{1/2}$, are:

   k_e   =  .46   .67   .84   1.01   1.18   1.35   1.55   1.79   2.15   3.03
   P(%)  =  10    20    30    40     50     60     70     80     90     99

So $a = k_e\sigma_a$ and $b = k_e\sigma_b$ are used as the major and minor
axes of ellipses that contain the specified percent of the distribution. For
example, 60% of the wind vectors end within the ellipse with major axis
$a = 1.35\,\sigma_a$ and with minor axis $b = 1.35\,\sigma_b$.
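The construction of a probability ellipse from summary statistics can be sketched as follows. This Python sketch is not part of the original report. It assumes that the quadratic referred to above is the characteristic equation $\sigma^4 - (\sigma_x^2 + \sigma_y^2)\sigma^2 + \sigma_x^2\sigma_y^2(1 - r^2) = 0$ of the component covariance matrix; that form is supplied here as an assumption. All function names are illustrative.

```python
import math


def ellipse_multiplier(p):
    """k_e = [2 ln(1 / (1 - p))] ** 0.5 for a fraction p with 0 <= p < 1."""
    return math.sqrt(2.0 * math.log(1.0 / (1.0 - p)))


def axis_std_devs(sigma_x, sigma_y, r):
    """Standard deviations along the major and minor axes.

    ASSUMPTION: they are the square roots of the two roots of
    s**4 - (sx**2 + sy**2) * s**2 + sx**2 * sy**2 * (1 - r**2) = 0.
    """
    total = sigma_x ** 2 + sigma_y ** 2
    det = (sigma_x * sigma_y) ** 2 * (1.0 - r ** 2)
    disc = math.sqrt(max(total ** 2 - 4.0 * det, 0.0))
    return math.sqrt((total + disc) / 2.0), math.sqrt((total - disc) / 2.0)


def probability_ellipse(sigma_x, sigma_y, r, p):
    """Axes (a, b) of the ellipse containing the fraction p of vector ends."""
    k_e = ellipse_multiplier(p)
    sigma_a, sigma_b = axis_std_devs(sigma_x, sigma_y, r)
    return k_e * sigma_a, k_e * sigma_b


# Hypothetical level with uncorrelated components: the 60% ellipse.
print(probability_ellipse(10.0, 5.0, 0.0, 0.60))
```

With r = 0 the axis standard deviations reduce to the larger and smaller of the two component standard deviations, as expected.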
5.  Correlation. 

a.  The models of the wind distribution at a level or on a pressure
surface can be extended to encompass the distribution of ballistic wind. This
is feasible since the ballistic wind is a linear combination of winds at a
number of levels. However, it is necessary to stop and discuss the correlation
of vectors before making this step.

b.  Unfortunately, approaches to the correlation of vectors are not as
well established as the subject of correlation of scalar quantities. A number
of techniques of vector correlations have been proposed; their answers differ
and there is not complete agreement as to which is the best approach. Two
approaches are developed here, one proposed by Court [17] and the other by
Durst [23].

c.  Concepts and equations of linear correlation are used in the
development of vector correlation. For convenience of reference, some of the
necessary equations will be listed without derivation. For simplicity, but
with no loss of generality, variables are treated as departures from the mean
and the relation between v and x is expressed as

$$v + q = m(v/x) = ax$$

where v is the dependent variable, x the independent variable, m(v/x) is the
empirically derived function for estimating v given x, q is the error of
estimate, and a is the regression coefficient. The standard error of estimate
($s_{v/x}$) is obtained by

(113)  $$(s_{v/x})^2 = \frac{1}{N}\sum q^2 = \frac{1}{N}\sum (v - ax)^2$$

where N is the number of cases. The correlation coefficient ($r_{vx}$) for
the linear relation between v and x is given by

(114)  $$(r_{vx})^2 = 1 - \frac{(s_{v/x})^2}{(s_v)^2}$$

where $s_v$ is the standard deviation of v. The equation for linear
multivariate correlation is analogous to that for the bivariate correlation.
The relation between v and the independent variables x and y is expressed as

$$v + q = m(v/xy) = a_1 x + a_2 y$$

(115)  $$(s_{v\cdot xy})^2 = \frac{1}{N}\sum (v - a_1 x - a_2 y)^2$$

The multiple correlation coefficient is given by

(116)  $$(r_{v\cdot xy})^2 = 1 - \frac{(s_{v\cdot xy})^2}{(s_v)^2}$$

The multiple correlation coefficient can also be expressed in terms of the
correlation coefficients for the various possible pairs of variables:

(117)  $$(r_{v\cdot xy})^2 = \frac{(r_{vx})^2 + (r_{vy})^2 - 2\,r_{vx}\,r_{vy}\,r_{xy}}{1 - (r_{xy})^2}$$

d.  In his approach to vector correlation, Court [18] begins by assuming
that the relation between two vectors can be expressed in a form analogous to
the linear scalar correlations. Assume that the vectors are departures from
the mean resultant vector. Thus, the relation is assumed to be
$\vec{W} = B\vec{Z}$, where $\vec{W}$ is the dependent vector, $\vec{Z}$ the
independent, and B the regression coefficient. $\vec{W}$ and $\vec{Z}$ are
vectors at two points in time or space (see Figure 20). The vector error of
estimate, $\vec{Q}$, is defined by

(118)  $$\vec{W} + \vec{Q} = B\vec{Z}$$

(1)  The vector Equation (118) can be written out in full, assuming for
convenience that each vector has only two components:

$$\begin{bmatrix}u\\ v\end{bmatrix} + \begin{bmatrix}q_u\\ q_v\end{bmatrix} = \begin{bmatrix}b_1 & c_1\\ b_2 & c_2\end{bmatrix}\begin{bmatrix}x\\ y\end{bmatrix}$$

This matrix equation can be expressed as a set of linear equations

(119)  $$u + q_u = b_1 x + c_1 y$$
       $$v + q_v = b_2 x + c_2 y$$

Figure 20.  Relationships between Court's Wind Vectors.

Thus, each component of $\vec{W}$ can be estimated by multiple regression on
the components of the independent variable $\vec{Z}$.

(2)  The vector standard error of estimate is defined by

(120)  $$(s_{w/z})^2 = \frac{1}{N}\sum \vec{Q}\cdot\vec{Q} = \frac{1}{N}\sum\left(q_u^2 + q_v^2\right) = (s_{u\cdot xy})^2 + (s_{v\cdot xy})^2$$

(3)  The next step is to define the vector correlation coefficient
and, from the definition, derive an expression for its value in terms of the
correlations between the components of $\vec{W}$ and $\vec{Z}$.

(121)  $$(r_{wz})^2 = 1 - \frac{(s_{w/z})^2}{(s_w)^2} = 1 - \frac{(s_{u\cdot xy})^2 + (s_{v\cdot xy})^2}{s_u^2 + s_v^2}$$

Using Equation (116),

$$(r_{wz})^2 = 1 - \frac{s_u^2\left[1 - (r_{u\cdot xy})^2\right] + s_v^2\left[1 - (r_{v\cdot xy})^2\right]}{s_u^2 + s_v^2} = \frac{s_u^2\,(r_{u\cdot xy})^2 + s_v^2\,(r_{v\cdot xy})^2}{s_u^2 + s_v^2}$$

From Equation (117) it is possible to rewrite the multiple correlation
coefficients to express a vector correlation coefficient in terms of the
standard deviations of the components of $\vec{W}$ and the correlation
coefficients of pairs of components of the two wind vectors.

(122)  $$(r_{wz})^2 = \frac{s_u^2\left[(r_{ux})^2 + (r_{uy})^2 - 2\,r_{ux}r_{uy}r_{xy}\right] + s_v^2\left[(r_{vx})^2 + (r_{vy})^2 - 2\,r_{vx}r_{vy}r_{xy}\right]}{\left(s_u^2 + s_v^2\right)\left[1 - (r_{xy})^2\right]}$$

Assuming that both $\vec{W}$ and $\vec{Z}$ conform to a circular normal
distribution, Equation (122) becomes

(123)  $$(r_{wz})^2 = \tfrac{1}{2}\left[(r_{ux})^2 + (r_{vy})^2\right] + \tfrac{1}{2}\left[(r_{uy})^2 + (r_{vx})^2\right]$$

The first term on the right side measures the tendency for the two vectors to
be parallel and the second term measures the rotation of one vector from the
other.
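Court's coefficient can be evaluated directly from paired component series. The following Python sketch is not part of the original report; it assumes the general form $(r_{wz})^2 = \{s_u^2[(r_{ux})^2 + (r_{uy})^2 - 2r_{ux}r_{uy}r_{xy}] + s_v^2[(r_{vx})^2 + (r_{vy})^2 - 2r_{vx}r_{vy}r_{xy}]\} / \{(s_u^2 + s_v^2)[1 - (r_{xy})^2]\}$, with (u, v) the components of the dependent vector and (x, y) those of the independent vector. The helper names and the sample values are hypothetical.

```python
import math


def _mean(a):
    return sum(a) / len(a)


def _variance(a):
    m = _mean(a)
    return sum((p - m) ** 2 for p in a) / len(a)


def _corr(a, b):
    """Ordinary (scalar) correlation coefficient of two equal-length series."""
    ma, mb = _mean(a), _mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)
    return cov / math.sqrt(_variance(a) * _variance(b))


def court_r_squared(u, v, x, y):
    """Squared vector correlation between (u, v) and (x, y)."""
    r_ux, r_uy = _corr(u, x), _corr(u, y)
    r_vx, r_vy = _corr(v, x), _corr(v, y)
    r_xy = _corr(x, y)
    su2, sv2 = _variance(u), _variance(v)
    numerator = (su2 * (r_ux ** 2 + r_uy ** 2 - 2.0 * r_ux * r_uy * r_xy)
                 + sv2 * (r_vx ** 2 + r_vy ** 2 - 2.0 * r_vx * r_vy * r_xy))
    return numerator / ((su2 + sv2) * (1.0 - r_xy ** 2))


# Hypothetical component series for the independent vector.
x = [1.0, -2.0, 3.0, 0.5, -1.5]
y = [0.5, 1.0, -1.0, 2.0, -2.5]
print(court_r_squared(x, y, x, y))                 # identical vectors
print(court_r_squared([-b for b in y], x, x, y))   # same vectors turned 90 degrees
```

Both calls give a value of 1 to within rounding, which illustrates that this coefficient does not distinguish a perfect parallel relation from a perfect rotation.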

e.  In his approach to vector correlation, Durst [23] deals with vectors
that are departures of the wind vector from the mean resultant wind and uses
$\vec{W}$ and $\vec{Z}$ as noted above. The vector correlation coefficient is
composed of two parts, one measuring the parallelism between the two vectors
and the other measuring the amount of turn. The analyst can conceive of a
relation between $\vec{W}$ and $\vec{Z}$ in which $\vec{W}$ tends to be
parallel to $\vec{Z}$ and to be proportional to it in magnitude. The closeness
of such a relation can be measured by a form of correlation coefficient that
is called the "stretch correlation coefficient" and is expressed as

(124)  $$\rho_s = \frac{\sum \vec{W}\cdot\vec{Z}}{\left[\sum W^2 \sum Z^2\right]^{1/2}}$$

Other, more complicated relations can be considered, particularly the
situation in which $\vec{W}$ tends to have a direction rotated by an angle
$\alpha$ from $\vec{Z}$. The angle $\alpha$ is called the angle of turn and
the part of the correlation coefficient that measures the effect of turn is

(125)  $$\rho_t = \frac{\sum \left|\vec{W}\times\vec{Z}\right|}{\left[\sum W^2 \sum Z^2\right]^{1/2}}$$

The total correlation coefficient is composed of both of these (stretch and
turn) relationships:

(126)  $$\rho = \rho_s + \rho_t$$

The best value of $\alpha$ can be expressed as

(127)  $$\alpha = \tan^{-1}\left[\frac{\sum \left|\vec{W}\times\vec{Z}\right|}{\sum \vec{W}\cdot\vec{Z}}\right]$$

To evaluate the stretch and turn parts of the correlation coefficient, wind
vectors can be expressed in terms of their components and dot and cross
operations can be performed on these component vectors. Component axes are the
same as those used in the previous sections and are shown in Figure 20.
Substituting the components into Equation (124),

$$\rho_s = \frac{\sum(\vec{u}+\vec{v})\cdot(\vec{x}+\vec{y})}{\left[\sum(\vec{u}+\vec{v})^2\,\sum(\vec{x}+\vec{y})^2\right]^{1/2}} = \frac{\sum(ux + vy)}{\left[\sum(u^2+v^2)\,\sum(x^2+y^2)\right]^{1/2}}$$

(128)  $$\rho_s = \frac{s_u s_x r_{ux} + s_v s_y r_{vy}}{\left[\left(s_u^2+s_v^2\right)\left(s_x^2+s_y^2\right)\right]^{1/2}}$$

At this point, assume that the distribution of wind vectors has a circular
normal distribution from which it follows that

$$\sum u^2 = \sum v^2$$

$$\sum x^2 = \sum y^2$$

Using these equalities and separating the term on the right into two terms,

(129)  $$\rho_s = \frac{1}{2}\left[\frac{\sum ux}{\left(\sum u^2 \sum x^2\right)^{1/2}} + \frac{\sum vy}{\left(\sum v^2 \sum y^2\right)^{1/2}}\right]$$

The two terms within the brackets are the correlation between u and x, and
between v and y, respectively. From this, it is evident that the stretch
vector correlation coefficient is the average of the correlation between both
pairs of components

(130)  $$\rho_s = \tfrac{1}{2}\left(r_{ux} + r_{vy}\right)$$

The procedure for evaluating the turn correlation coefficient is similar to
that used with the stretch correlation. Substituting the components into
Equation (125),

$$\rho_t = \frac{\sum(\vec{u}+\vec{v})\times(\vec{x}+\vec{y})}{\left[\sum(\vec{u}+\vec{v})^2\,\sum(\vec{x}+\vec{y})^2\right]^{1/2}} = \frac{\sum(vx - uy)}{\left[\sum(u^2+v^2)\,\sum(x^2+y^2)\right]^{1/2}}$$

(131)  $$\rho_t = \frac{s_v s_x r_{vx} - s_u s_y r_{uy}}{\left[\left(s_u^2+s_v^2\right)\left(s_x^2+s_y^2\right)\right]^{1/2}}$$

Again, assuming a circular normal distribution and separating the term on the
right into two terms,

(132)  $$\rho_t = \frac{1}{2}\left[\frac{\sum vx}{\left(\sum v^2 \sum x^2\right)^{1/2}} - \frac{\sum uy}{\left(\sum u^2 \sum y^2\right)^{1/2}}\right]$$

where terms in the brackets are the correlation coefficients. Thus,

(133)  $$\rho_t = \tfrac{1}{2}\left(r_{vx} - r_{uy}\right)$$

Substituting Equations (130) and (133) into Equation (126), the vector
correlation is

$$\rho = \frac{r_{ux} + r_{vy}}{2} + \frac{r_{vx} - r_{uy}}{2}$$

which, assuming the circular normal distribution, is a simplified equation
for $\rho$. However, the substitution of Equations (128) and (131) into
Equation (126) gives the value of $\rho$ without the restriction as to type of
distribution:

(134)  $$\rho = \frac{s_u s_x r_{ux} + s_v s_y r_{vy}}{\left[\left(s_u^2+s_v^2\right)\left(s_x^2+s_y^2\right)\right]^{1/2}} + \frac{s_v s_x r_{vx} - s_u s_y r_{uy}}{\left[\left(s_u^2+s_v^2\right)\left(s_x^2+s_y^2\right)\right]^{1/2}}$$

The inclusion of the turn portion of the correlation adds very little to the
numerical value of the vector correlation in most problems [23]. It usually
is omitted and only the stretch vector correlation coefficient used.
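The stretch and turn parts can be computed directly from the component sums, without first forming the scalar correlations. The Python sketch below is not part of the original report; it uses the sum forms $\sum(ux + vy)$ and $\sum(vx - uy)$ divided by $[\sum(u^2 + v^2)\sum(x^2 + y^2)]^{1/2}$, and it removes the component means first because the vectors are defined as departures from the mean resultant wind. The function name and sample values are hypothetical.

```python
import math


def _departures(a):
    """Departures of a series from its own mean."""
    m = sum(a) / len(a)
    return [p - m for p in a]


def durst_stretch_and_turn(u, v, x, y):
    """Return (stretch, turn) for dependent (u, v) and independent (x, y)."""
    u, v, x, y = (_departures(s) for s in (u, v, x, y))
    denom = math.sqrt(sum(a * a + b * b for a, b in zip(u, v))
                      * sum(c * c + d * d for c, d in zip(x, y)))
    stretch = sum(a * c + b * d for a, b, c, d in zip(u, v, x, y)) / denom
    turn = sum(b * c - a * d for a, b, c, d in zip(u, v, x, y)) / denom
    return stretch, turn


# Hypothetical component series for the independent vector.
x = [1.0, -2.0, 3.0, 0.5, -1.5]
y = [0.5, 1.0, -1.0, 2.0, -2.5]
print(durst_stretch_and_turn(x, y, x, y))                 # identical vectors
print(durst_stretch_and_turn([-b for b in y], x, x, y))   # turned 90 degrees
```

Identical vectors give a stretch of 1 and a turn of 0; the same vectors turned through a right angle give a stretch of 0 and a turn of 1, which shows how the two parts separate parallelism from rotation.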

f.  The two equations for the computation of vector correlation developed
in previous sections are considerably different from each other even though
both contain a stretch term and a turn term. Also, one is not derivable from
the other. For comparison, the two equations are repeated:

Court (Equation (122)):

$$(r_{wz})^2 = \frac{s_u^2\left[(r_{ux})^2 + (r_{uy})^2 - 2\,r_{ux}r_{uy}r_{xy}\right] + s_v^2\left[(r_{vx})^2 + (r_{vy})^2 - 2\,r_{vx}r_{vy}r_{xy}\right]}{\left(s_u^2 + s_v^2\right)\left[1 - (r_{xy})^2\right]}$$

Durst (Equation (134)):

$$\rho = \frac{s_u s_x r_{ux} + s_v s_y r_{vy}}{\left[\left(s_u^2+s_v^2\right)\left(s_x^2+s_y^2\right)\right]^{1/2}} + \frac{s_v s_x r_{vx} - s_u s_y r_{uy}}{\left[\left(s_u^2+s_v^2\right)\left(s_x^2+s_y^2\right)\right]^{1/2}}$$

The questions of which is the better measure of correlation and whether other
methods should be considered cannot be answered until there is considerably
more testing and research. However, the question of how well these two
coefficients agree has been investigated. Charles [14] has compared both
correlation coefficients temporally and spatially; Scott [44] has compared the
two over time intervals. Charles used serially complete winds at 500-, 300-,
and 100-mb surfaces for a five-year period at 50 stations; Scott used only one
station. For the time-lag vector correlation coefficients, $r_{wz}$ is
slightly larger than $\rho$ but there is no appreciable gain in using the
computationally cumbersome $r_{wz}$ in place of $\rho$. When these two
correlation coefficients are computed for two sets of spatially separated
winds, the agreement is good when $\rho$ is greater than 0.3; but, when
$\rho$ is less than 0.3, there is a major difference between the two
coefficients. Of course, part of the trouble is that $r_{wz}$ is defined as a
positive number and $\rho$ can be either positive or negative. There are two
reasons for using the $\rho$ correlation coefficient and both concern the ease
of computation and manipulation rather than rigorous mathematics. The first
reason is that $\rho$ is much easier to compute, and statistical parameters
required for its computation are more readily available, especially if one
uses the stretch vector correlation coefficient as a close approximation. The
second reason is that the stretch vector correlation coefficient acts in a
manner similar to the scalar correlation coefficient, since it may be extended
to the concept of total and partial vector correlation coefficients [23].


6.  Ballistic  Wind  Distribution. 


a.  A ballistic wind is defined as a fictitious, single wind that is
representative of the layer from the ground to bombing altitude and is
determined in such a way that its effect on the bomb is the same as the
variable winds actually encountered. Mathematically, it is defined as

$$\vec{V}_b = \int_{\text{sfc}}^{\text{bomb alt.}} w(Z)\,\vec{V}(Z)\,dZ$$

where

$$\int_{\text{sfc}}^{\text{bomb alt.}} w(Z)\,dZ = 1$$

w(Z) is the ballistic weighting as a function of altitude, and $\vec{V}(Z)$
is the wind profile as a function of altitude. Of course, the analyst does not
usually know or cannot express the wind as a continuous function of altitude
and must adopt a summation process in place of an integral. Therefore, the
ballistic wind is defined as:

(135)  $$\vec{V}_b = \sum_{i=1}^{k} w_i \vec{V}_i \qquad \text{and} \qquad \sum_{i=1}^{k} w_i = 1$$

where k is the number of zones chosen to represent the total bombing zone,
$w_i$ is the ballistic weight for zone i, and $\vec{V}_i$ is the wind vector
for zone i. The mean resultant ballistic wind is the weighted sum of the mean
resultant winds for the k levels.

(136)  $$\bar{\vec{V}}_b = \sum_{i=1}^{k} w_i \bar{\vec{V}}_i$$

The bar indicates a mean. The vector variance can be computed from the
variance of each of the components of the ballistic wind. The same notation
that was used in discussing the wind distribution at a level is used here,
except that subscripts are used to denote a ballistic wind or component and to
indicate levels used in computing a ballistic wind. The departure of a
ballistic wind from a mean resultant ballistic wind is

$$\vec{v}_b = \vec{V}_b - \bar{\vec{V}}_b = \vec{x}_b + \vec{y}_b$$

from this it is possible to derive

(137)  $$s_b^2 = s_{x_b}^2 + s_{y_b}^2$$

where $s_b$ is the vector standard deviation of the ballistic wind and the two
terms on the right are the variances of x and y components of the ballistic
wind, respectively.

b.  The components of the ballistic wind are defined in the same manner as
the ballistic wind and

(138)  $$x_b = \sum_{i=1}^{k} w_i x_i$$

The variance of $x_b$ is

(139)  $$s_{x_b}^2 = \frac{1}{N}\sum^{N} x_b^2$$

where $\sum^{N}$ is used to indicate the summation over N ballistic winds in
the sample. Substituting Equation (138) into Equation (139), we have

$$s_{x_b}^2 = \frac{1}{N}\sum^{N}\left[\sum_{i=1}^{k} w_i x_i\right]^2 = \frac{1}{N}\sum^{N}\sum_{i=1}^{k}\sum_{j=1}^{k} w_i w_j x_i x_j = \sum_{i=1}^{k}\sum_{j=1}^{k} w_i w_j\left(\frac{1}{N}\sum^{N} x_i x_j\right)$$

(140)  $$s_{x_b}^2 = \sum_{i=1}^{k}\sum_{j=1}^{k} w_i w_j\, r_{x_{ij}}\, s_{x_i} s_{x_j}$$

where $r_{x_{ij}}$ is the correlation between the x components of the wind at
levels i and j. In the same manner, the equation for the variance of the
y-component of the ballistic wind can be derived, thus

(141)  $$s_{y_b}^2 = \sum_{i=1}^{k}\sum_{j=1}^{k} w_i w_j\, r_{y_{ij}}\, s_{y_i} s_{y_j}$$
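The step from the sample variance of $x_b$ to the double sum over levels can be verified numerically. The Python sketch below is not part of the original report; it computes the variance of the ballistic x-component once directly from the weighted series and once from the interlevel correlations and level standard deviations, $\sum_i \sum_j w_i w_j r_{x_{ij}} s_{x_i} s_{x_j}$. The function names, weights, and wind values are hypothetical.

```python
import math


def _departures(a):
    """Departures of a series from its own mean."""
    m = sum(a) / len(a)
    return [p - m for p in a]


def variance_direct(weights, levels):
    """Variance of x_b = sum_i w_i * x_i, formed observation by observation."""
    n = len(levels[0])
    x_b = [sum(w * lev[t] for w, lev in zip(weights, levels)) for t in range(n)]
    return sum(d * d for d in _departures(x_b)) / n


def variance_from_moments(weights, levels):
    """Same variance from interlevel correlations and standard deviations."""
    n = len(levels[0])
    dev = [_departures(lev) for lev in levels]
    s = [math.sqrt(sum(d * d for d in lev) / n) for lev in dev]
    total = 0.0
    for i in range(len(levels)):
        for j in range(len(levels)):
            r_ij = sum(a * b for a, b in zip(dev[i], dev[j])) / (n * s[i] * s[j])
            total += weights[i] * weights[j] * r_ij * s[i] * s[j]
    return total


# Hypothetical x-components at three levels (four observations each).
levels = [[1.0, 3.0, -2.0, 4.0], [2.0, 5.0, -1.0, 6.0], [0.0, 8.0, -3.0, 7.0]]
weights = [0.2, 0.3, 0.5]
print(variance_direct(weights, levels), variance_from_moments(weights, levels))
```

The second form is the one of practical interest, since it needs only summarized statistics for each level and pair of levels rather than the original observations.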


c.  Although it is possible to compute the ballistic wind, the mean
resultant ballistic wind, its variance and variances of the components, there
is the question of what to do about the correlation $r_{x_b y_b}$ between the
components of the ballistic wind. The correlation coefficient and the
variances of the components are necessary for the determination of whether the
distribution is circular or elliptical.

d.  Of course, if the variances are equal, the assumption of a circular
normal distribution is a natural and probably the most practical assumption.
Even though the variances of the components are not equal, the analyst may
assume that the circular normal distribution is a good approximation and
compute the vector standard deviation from Equation (137). The assumption of
circularity applies only to the ballistic wind and not to the distribution of
winds at the various levels. However, assuming that the distributions at each
level are circular normal, it would follow that the ballistic wind
distribution is circular normal, since the ballistic wind is a linear
combination of the winds at the various levels. If the wind at level i had a
circular normal distribution,

(142)  $$s_{x_i} = s_{y_i} = \frac{s_i}{\sqrt{2}}$$

where $s_i$ is the vector standard deviation at level i, and

$$s_b^2 = s_{x_b}^2 + s_{y_b}^2$$

$$s_b^2 = \sum_{i=1}^{k}\sum_{j=1}^{k} w_i w_j\, r_{x_{ij}}\, s_{x_i} s_{x_j} + \sum_{i=1}^{k}\sum_{j=1}^{k} w_i w_j\, r_{y_{ij}}\, s_{y_i} s_{y_j}$$

From Equation (142), this may be rewritten as

$$s_b^2 = \sum_{i=1}^{k}\sum_{j=1}^{k} w_i w_j\, s_i s_j\, \frac{r_{x_{ij}} + r_{y_{ij}}}{2}$$

and, since circularity is assumed, the analyst can use Equation (130) for
values of stretch correlation coefficients between levels as an approximation
of the vector correlation coefficient. Using Equation (130) values,

(143)  $$s_b^2 = \sum_{i=1}^{k}\sum_{j=1}^{k} w_i w_j\, \rho_{s_{ij}}\, s_i s_j$$

One thing that makes this equation so useful is that Kochanski [30] has
investigated the geographical distribution of the stretch vector correlation
coefficient and has developed a procedure for estimating its value for most
interlevel combinations (i.e., pairs of levels) over the Northern Hemisphere.
Interlevel correlations for components are available for only limited areas in
the Northern Hemisphere [14].
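Under the circular assumption, the vector standard deviation of the ballistic wind follows from the weights, the vector standard deviations at each level, and the interlevel stretch correlations, $s_b^2 = \sum_i \sum_j w_i w_j \rho_{s_{ij}} s_i s_j$. The Python sketch below is not part of the original report. The function names and all numerical values are hypothetical; the radius function applies the circular normal multiplier $k = [-\ln(1 - P)]^{1/2}$ to the result.

```python
import math


def ballistic_vector_std(weights, vector_sigmas, stretch_corr):
    """Vector standard deviation s_b of the ballistic wind.

    weights       -- ballistic weights w_i, which must sum to 1
    vector_sigmas -- vector standard deviations s_i at each level
    stretch_corr  -- square matrix of interlevel stretch correlations
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("ballistic weights must sum to 1")
    k = len(weights)
    variance = sum(weights[i] * weights[j] * stretch_corr[i][j]
                   * vector_sigmas[i] * vector_sigmas[j]
                   for i in range(k) for j in range(k))
    return math.sqrt(variance)


def circle_radius(s_b, p):
    """Radius of the circle holding the fraction p of ballistic wind vectors."""
    return math.sqrt(-math.log(1.0 - p)) * s_b


# Hypothetical three-zone example (speeds in knots).
weights = [0.2, 0.3, 0.5]
sigmas = [10.0, 20.0, 30.0]
rho = [[1.0, 0.8, 0.5],
       [0.8, 1.0, 0.7],
       [0.5, 0.7, 1.0]]
s_b = ballistic_vector_std(weights, sigmas, rho)
print(s_b, circle_radius(s_b, 0.90))
```

Lower interlevel correlations reduce $s_b$, since departures at different levels then tend to offset one another in the weighted sum.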

e.  The assumption that the correlation coefficient, $r_{x_b y_b}$, is
zero does not prevent the analyst from using the techniques applicable to an
elliptical distribution. This assumption merely implies that the major axis of
the ellipse is parallel to one of the coordinate axes. In Figure 21, the
major axis is parallel to the x-axis and the variance of the x-component is
larger than the variance of the y-component. In this situation and the others
that follow, all equations given previously for the elliptical distribution
apply.

Figure 21.  Elliptical Normal Distribution when the Correlation Coefficient
Equals Zero.

f.  From a study of these data, it might be more appropriate to assume
that the major axis of the ellipse lies along the direction of the mean
resultant ballistic wind. This assumption implies the assumption of a value
for the correlation coefficient for the relation between x and y. Since they
are departures from the mean, then

$$y_b = b\,x_b = r_{x_b y_b}\,\frac{s_{y_b}}{s_{x_b}}\,x_b = (\tan\psi)\,x_b$$

where $\psi$ is the slope of the major axis of the ellipse or, in other
words, the angle that it makes with the x-axis (see Figure 22). From this, it
follows that

(144)  $$r_{x_b y_b} = \frac{s_{x_b}}{s_{y_b}}\tan\psi$$

Of course, this technique is general and may be applied to any other slope of
the major axis of the ellipse that one may wish to assume.

Figure 22.  Elliptical Normal Distribution when the Correlation Coefficient
is not Equal to Zero.

g.  Since the ballistic wind distribution is a combination of the wind
distributions at a number of levels, the analyst might assume that
orientations of the ellipses at these levels affect the orientation of the
ellipses of the ballistic wind distribution. Carrying this a little further,
he might assume that the correlation between the components of the ballistic
wind is the average of the correlations between the components at the various
levels.

(145)  $$r_{x_b y_b} = \frac{1}{k}\sum_{i=1}^{k} r_{x_i y_i}$$

where $r_{x_i y_i}$ is the correlation between the components at level i.
Then, the analyst merely applies the various equations of the elliptical
normal distribution.

h.  There are undoubtedly other methods for estimating the correlation
between the components of the ballistic wind and determining the better
assumption, circularity or ellipticity. This discussion is merely an
introduction to the possible methods of handling the distribution of ballistic
winds.
1.  Introduction.

a.  In the solution of applied climatological problems, techniques vary
with the degree of complexity. However, three major elements enter into every
problem:

(1)  Climate, composed of the various weather parameters.

(2)  Space, consisting of the surface and upper layers of the
atmosphere.

(3)  Time, comprising the series and sequence of weather observations.

b.  Each of these elements can appear in the problem either as a simple or
a complex element. In other words, the climatic factor can involve one
weather parameter or many. Space can mean a single point, several points, or
an area. Time may enter only in the restricted sense that observations cover
an interval of time, or the problem may pose specific time limitations in
terms of cumulative effects or conditioning of subsequent events by preceding
situations [31].

c.  Selected USAF ETAC techniques in applied climatology are included in
this chapter as examples of those used to solve the meteorological portions of
varied operational, engineering, and design problems of military planners.
Many other techniques, methods, and procedures developed at USAF ETAC can be
found in the Climatic Methods File (CMF) [9] and various AWS Technical Reports
prepared at USAF ETAC.

2.  Precipitation, Soil Moisture, and Tractionability Charts.

a.  The actual amounts of precipitation and the effects thereof have either
direct or indirect application to most military operations, plans, and
intelligence. One of the more important aspects concerns knowledge of the
moisture content of the soil. This moisture content, directly affecting the
trafficability of land areas, is of extreme importance. The amount of soil
moisture and the condition of the soil surface cannot be determined adequately
by a knowledge of only the amount of water added to the soil by precipitation.
Among other factors that must be considered are water loss by evaporation and
plant use (evapotranspiration), drainage in and out of the area of concern,
and soil types. All of these variable parameters must be considered by the
meteorologist if he is to make a reasonable estimate of soil-moisture content.
The following paragraphs describe the technique used at USAF ETAC in dealing
with soil-moisture content and tractionability.

b.  A method credited to Charles W. Thornthwaite, with modifications and
assumptions described herein, is used for soil moisture and tractionability
calculations. Precipitation, evapotranspiration, and soil-moisture amounts
are given in units of millimeters of water. Changes in soil moisture
($\Delta SM$) during a period depend on the following parameters:

(1)  Mean temperature, $T = (T_{max} + T_{min})/2$.

(2)  Hours of daylight [used to determine potential
evapotranspiration (PE)].

(3)  Total precipitation (P).

(4)  Ratio of the previous soil moisture (at start of period) to the
soil moisture at field capacity (previous SM/200).

c.  Ten-day periods (decades) are used for soil moisture and
tractionability calculations. For USAF ETAC's wide areas of interest, 200 mm
of water in the top meter of soil are assumed to provide a full soil bank or
to bring soil moisture to field capacity. Therefore, when calculating soil
moisture, amounts in excess of 200 mm are considered as surplus, or run-off.
In contrast, however, when calculating tractionability, amounts in excess of
200 mm are included to obtain classifications 4 (very moist) and 5 (wet).
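The bookkeeping implied by the 200-mm field capacity can be sketched in Python. This sketch is not the USAF ETAC program. The change in soil moisture for the decade is taken as an input, since its computation is not given at this point in the text; the lower bound of zero on stored moisture and the use of (maximum + minimum)/2 for the daily mean temperature are assumptions made for illustration, and all names are hypothetical.

```python
FIELD_CAPACITY_MM = 200.0  # water in the top meter of soil at field capacity


def decade_mean_temperature(daily_max, daily_min):
    """Mean temperature for the period.

    ASSUMPTION: each daily mean is (maximum + minimum) / 2.
    """
    daily_means = [(hi + lo) / 2.0 for hi, lo in zip(daily_max, daily_min)]
    return sum(daily_means) / len(daily_means)


def update_soil_moisture(previous_sm, delta_sm):
    """Return (stored soil moisture, surplus) in mm at the end of the period.

    Amounts above field capacity are counted as surplus (run-off).
    ASSUMPTION: stored moisture is not allowed to fall below zero.
    """
    raw = previous_sm + delta_sm
    stored = min(max(raw, 0.0), FIELD_CAPACITY_MM)
    surplus = max(raw - FIELD_CAPACITY_MM, 0.0)
    return stored, surplus


# Hypothetical decade: a wet period followed by a drying period.
print(update_soil_moisture(180.0, 35.0))
print(update_soil_moisture(120.0, -20.0))
```

For tractionability classification the uncapped total (stored plus surplus) would be retained, since amounts above 200 mm distinguish the very moist and wet classes.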

d.  Daily maximum and minimum temperatures, from which 10-day and monthly
mean temperatures are computed, and total daily precipitation amounts are the
only meteorological variables that are used in the computation method. It is
important to note that these values are reported only in the special
phenomena groups (Code 7 groups for Europe and most of Asia, and Code 2 and 3
groups for Southeast Asia).

e. The computational method used at USAF ETAC is basically a bookkeeping
procedure. The initial or starting condition of soil moisture for each
selected grid point is determined, then the change in soil moisture due to
subsequent precipitation, evaporation, and plant use (evapotranspiration) is
added to the previous soil-moisture value to obtain the new grid-point value
of soil moisture at the end of the period.

(1) All totals of precipitation, means of temperature, and soil-moisture
computations are made for fixed grid points. Values of these parameters
must be available for all computation points each day for use in the
decadal (10-day) computation. Individual station reports are received too
irregularly to permit even selected stations to be used for computation
points. Nevertheless, all available data are used each day. Analysis of each
parameter is made by the continuous weighted-mean technique. Values are
assigned from the technique to provide the necessary daily grid-point values
from which soil-moisture values are computed. A rectangular grid, with grid
points spaced approximately 60 nm apart (1/3 GWC grid spacing), is used.

(2) When insufficient input data are available to satisfy the analysis
program for a grid point, a missing indicator is printed in lieu of a
value. The analyst has the option of inserting estimated daily values of
missing parameters if he desires. If reasonable estimates for missing daily
values are not possible, the USAF ETAC computer program (soil moisture)
inserts daily values based on long-term means when decade computations are
made. Such missing daily grid-point values are rare in Europe but quite
common in parts of Asia.

(3) The daily analysis of total precipitation for grid points is the
most critical portion of the program, since the precipitation field is
generally represented by a discontinuous type function. Monthly values of
total precipitation are extracted from analyses for selected CLIMAT stations
and verified against monthly totals reported by these stations. Verification
results are generally good.

f. Changes in soil moisture during the decade are calculated by the
following method:

(1) The potential evapotranspiration (PE) is calculated for each grid
point by determining the yearly heat index (I) from Thornthwaite and Mather's
Table 1.1 [48]. To do this, enter the table with the climatic monthly mean
temperature for the particular grid point and extract the monthly heat index
(i) for each month. The yearly heat index is the total of the 12 monthly i
values. This procedure needs to be done only once, as each location or grid
point and tables of I are in the prepared computer program. Next, determine
the unadjusted PE daily value for each point from Figure 1.1 [48] by entering
the figure with the appropriate mean 10-day temperature and the appropriate I
value. Now, divide the unadjusted PE value by 28, 30, or 31, as applicable,
to obtain the daily value. To obtain the adjusted PE, use Table 1.2, page 98
of the same reference. Enter the table with the appropriate latitude and
month to extract the mean possible monthly duration of sunlight. Multiply the
unadjusted PE value by this sunlight value to obtain an adjusted daily PE
value. Multiply the adjusted daily PE value by ten to obtain the 10-day value.
Note that the 10-day mean temperature is the only variable parameter involved
in the computation of PE.

(2) P minus PE is then calculated for each grid point.

(3) Current soil moisture, as of the end of the decade, is determined
by adding the change in soil moisture ($\Delta SM$) to the soil moisture at
the end of the previous decade. The method of calculating $\Delta SM$ depends
upon the sign of P minus PE. If P minus PE is positive, then

$$\Delta SM = P - PE$$

If P minus PE is negative, then

$$\Delta SM = (P - PE) \times \frac{\text{Previous SM}}{200}$$

(4) For soil-moisture calculations, values in excess of 200 mm of
water per top meter of soil are discarded at this point. However,
tractionability class information is determined directly from soil-moisture
information before the excess of 200 mm of water per top meter of soil is
discarded. Descriptions of tractionability classes, which appear on pages 20
and 21 of Air Force Surveys in Geophysics No. 94 [50], apply to Table 28,
except that a soil depth of one meter is used instead of the two feet noted in
the referenced descriptions. Table 28 gives the tractionability classes that
appear, as numbers, on tractionability charts.


TABLE 28

Tractionability Classes.

No. on                       Average Soil Moisture          Tractionability
Tractionability              in One-Meter Depth
Charts           Class       (% of Field Capacity)    Plastic Soils     Sandy Soils

0           Frozen surface   Average mean temp.       Improved if moist, little
                             was less than 0°C        changed if dry
1           Very dry         Less than 33%            Good              Poor to very poor
2           Dry              33-75%                   Good              Poor
3           Moist            75-115%                  Deteriorates      Good
                                                      rapidly in
                                                      this range
4           Very moist       115-155%                 Poor              Excellent
5           Wet              155-200%                 Nearly            Fair
                                                      impossible
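
The bookkeeping of paragraphs e and f and the class limits of Table 28 can be sketched in a few lines of Python. This is an illustrative sketch only, not the USAF ETAC program: the function names are hypothetical, PE is assumed to have been obtained already from the Thornthwaite tables, and a value that falls exactly on a class boundary is assumed to belong to the higher class.

```python
FIELD_CAPACITY_MM = 200.0  # assumed full soil bank for the top meter of soil


def delta_soil_moisture(p_mm, pe_mm, previous_sm_mm):
    """Change in soil moisture over one decade (paragraph f(3))."""
    surplus = p_mm - pe_mm
    if surplus >= 0:
        return surplus
    # Drying is scaled by how full the soil bank was at the start.
    return surplus * previous_sm_mm / FIELD_CAPACITY_MM


def tractionability_class(sm_mm, mean_temp_c):
    """Map soil moisture (mm, excess retained) to the Table 28 class number."""
    if mean_temp_c < 0.0:
        return 0  # frozen surface
    percent = 100.0 * sm_mm / FIELD_CAPACITY_MM
    # Assumption: a value on a boundary falls in the higher class.
    for upper_limit, class_number in ((33, 1), (75, 2), (115, 3), (155, 4)):
        if percent < upper_limit:
            return class_number
    return 5


def update_decade(p_mm, pe_mm, previous_sm_mm, mean_temp_c):
    """Return (new soil moisture, tractionability class) for one grid point."""
    uncapped = previous_sm_mm + delta_soil_moisture(p_mm, pe_mm, previous_sm_mm)
    # The class is read before the excess over field capacity is discarded.
    tract_class = tractionability_class(uncapped, mean_temp_c)
    new_sm = min(uncapped, FIELD_CAPACITY_MM)  # excess is surplus or run-off
    return new_sm, tract_class


# A wet decade on a nearly full soil bank, then a dry decade on a half-full one.
print(update_decade(60.0, 20.0, 180.0, 10.0))
print(update_decade(10.0, 50.0, 100.0, 10.0))
```

Keeping the uncapped value only long enough to read the class mirrors paragraph f(4), where the excess over 200 mm is discarded for soil moisture but retained for tractionability.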

g. In the interpretation and use of tractionability charts, there are
three important considerations:

(1) As previously stated, grid values must be considered as average
values over an area; the averaging process depends on how much data are
available in the vicinity of each particular grid point.


(2) Since soil moisture and tractionability calculations are made at
the end of every decade (the last day of the month being the end of the third
decade for the month), an assumption of an even distribution of precipitation
throughout the decade is inherent in the computations. Thus, if most
precipitation falls early in the decade, actual soil moisture will be somewhat
less and the tractionability class number somewhat lower than that calculated.
The converse is true if most precipitation falls late in the decade. Soil
moisture and tractionability calculations are valid for the last day of the
decade.

(3) Terrain effects on surface run-off and soil-moisture retention
are not considered. Variations of the field capacity due to different soil
types and root zones are not considered. Our assumption of a field capacity
of 200 mm of water for the top meter layer of soil is reasonable for a clay
loam soil with a grain crop; however, heavy clay soils hold more and sandy
soils hold less water.


3.  Climatological  Wind  Factors. 

a. Climatological wind information is used in planning air operations
when the planning period exceeds the time period for which the forecasting
agency can supply reliable flight-level wind forecasts. In air route planning,
the analyst is interested in the extent to which the wind either aids or
retards the aircraft.

b. A wind factor can be defined as the difference between the ground
speed of an aircraft and its true airspeed. This relationship is given in
Equation (146).


(146)  $w = |G| - |A|$

and is shown graphically in Figure 23, where G is the ground speed of the
aircraft, A the airspeed of the aircraft, V the velocity of the wind, w the
wind factor, u the wind component along the aircraft track, and v the
crosswind component. The ground speed of the aircraft is given by

$$G = A \cos\alpha + u = A\,(1 - \sin^2\alpha)^{1/2} + u$$

and, since $v = A \sin\alpha$,

$$G = A \left(1 - \frac{v^2}{A^2}\right)^{1/2} + u$$

From the binomial formula,

$$\left(1 - \frac{v^2}{A^2}\right)^{1/2} = 1 - \frac{1}{2}\frac{v^2}{A^2} - \frac{1}{8}\frac{v^4}{A^4} - \cdots$$

Neglecting 4th powers and greater of v/A gives

$$\left(1 - \frac{v^2}{A^2}\right)^{1/2} \approx 1 - \frac{v^2}{2A^2}$$

so

$$G = A \left(1 - \frac{v^2}{2A^2}\right) + u$$

Therefore

$$w = G - A = A \left(1 - \frac{v^2}{2A^2}\right) + u - A$$

and the wind factor for a route is given by

(147)  $w = u - \frac{v^2}{2A}$

Figure 23. Relationship Between Ground Speed, Airspeed, Wind Velocity, and
Wind Factor.
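
As a rough check on the approximation behind Equation (147), the exact wind factor, $G - A$ with $G = (A^2 - v^2)^{1/2} + u$, can be compared with $u - v^2/2A$. The sketch below is illustrative; the function names are hypothetical, and the sample values (u = -4.4 knots, v = 49.8 knots, A = 250 knots) are those of Leg 1 in Example 45.

```python
import math


def wind_factor_exact(u, v, airspeed):
    """Exact wind factor: ground speed minus airspeed (knots)."""
    ground_speed = math.sqrt(airspeed**2 - v**2) + u
    return ground_speed - airspeed


def wind_factor_approx(u, v, airspeed):
    """Equation (147): terms in (v/A)**4 and higher are neglected."""
    return u - v**2 / (2.0 * airspeed)


exact = wind_factor_exact(-4.4, 49.8, 250.0)
approx = wind_factor_approx(-4.4, 49.8, 250.0)
print(f"exact {exact:.2f} kt, approximate {approx:.2f} kt")
```

For a crosswind of about one-fifth of the airspeed, the two values differ by less than 0.1 knot, which suggests why the higher-order terms can be dropped.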


c. The wind factor over a route = [w] = harmonic mean of w over all the
legs of a route.

(148)  Harmonic mean $= \dfrac{N}{\sum_{i=1}^{N} (1/w_i)}$, for N legs

Equation (148) is appropriate because the route wind factor is computed over
equal (approximately 300 mi) legs with variable time. Since the error in
using the arithmetic mean is small for wind speeds ≤ 1/3 A, arithmetic means
are used.

d. The true climatological distribution of the wind factor (w) is found
by computing a w each day for the route over an adequate period of record;
this procedure is laborious, even by machine. If winds aloft are assumed
circular normal, then the mean of w for a route may be estimated from
$\bar{V}_r$ (resultant wind speed) and $\sigma$ (standard vector deviation of
wind velocity). The relationship between ground speed, airspeed, resultant
wind, and standard vector deviation of wind velocity is shown in Figure 24.

Figure 24. Relationship Between Ground Speed, Airspeed, Resultant Wind
Speed, and Standard Vector Deviation of Wind Velocity.


From Equation (147), $w = u - v^2/2A$, and over a long period of time

$$\bar{w} = \bar{u} - \frac{\overline{v^2}}{2A}$$

where

$$v^2 = [\bar{v} + (v - \bar{v})]^2 = \bar{v}^2 + 2\bar{v}(v - \bar{v}) + (v - \bar{v})^2$$

and $\overline{(v - \bar{v})} = 0$ and $\overline{(v - \bar{v})^2} = \sigma_v^2$.
Assuming a circular normal distribution, $\sigma_v^2 = (\sigma/\sqrt{2})^2$, so
$\overline{v^2} = \bar{v}^2 + \sigma^2/2$, so

$$\bar{w} = \bar{u} - \frac{1}{2A}\left(\bar{v}^2 + \frac{\sigma^2}{2}\right)$$

This is a time (monthly) mean and, taking the equivalent tailwind for the
route as the mean or average of all legs, gives empirically
$[\bar{w}] = \overline{[w]}$. Therefore, the mean or climatological wind factor
for a route is given by

(149)  $[\bar{w}] = [\bar{u}] - \dfrac{1}{2A}\left([\bar{v}^2] + \dfrac{[\sigma^2]}{2}\right)$

In Equation (149) the bar above the symbol indicates an average with time
(e.g., monthly mean for a leg) and the [ ] indicates an average over all the
legs. The first term is the tailwind component and the second term is the
effect of the crosswind.

e. The standard deviation, $\sigma_w$, of the equivalent headwind over the
route as a whole is given by


(150)  ^1  + ^ ^ 1 y'  J R(x)  dxds 


where S is the route length, s a segment of total route, A the airspeed, v the
crosswind component, and R(x) the correlation between effective tailwinds x
miles apart along the route. The second and third terms in the parentheses
are usually negligible and the above equation reduces to


(151) 


w 


a2 


/ I 

o o 


dxds 


where $\sigma^2$ is assumed to be constant along the route, but is applicable
to the problem when $\sigma^2$ varies along the route. So $\sigma^2$ is
replaced by $[\sigma^2]$ and this gives Equation (152), which forms the basis
of calculation of the variability of equivalent headwinds.


(152) 


dxds 


k 


Equation (152) has been evaluated by Sawyer [42] from values of the
correlation coefficient, R(x), determined from two years' measurements of the
geostrophic wind on the 500-mb surface over the North Atlantic. These values
of the k-factor are tabulated in Table 29.

f. Percentiles of the wind factor may be found from

(153)  $W_p = [W] \pm Z\,\sigma_w$

where Z is the standardized cumulative normal deviate. Percentiles are
used when risk is involved; 10% risk, called "90% worst wind factor," is
most commonly used. The 90% worst wind factor is

(154)  $W_{90} = [W] - 1.28\,\sigma_w$


TABLE 29

Factor to Convert Mean Standard Vector Deviation of Winds Over a Route
($[\sigma]$) to Standard Deviation of the Wind Factor ($\sigma_w$).

Route Length (nm)    Factor ($k = \sigma_w/[\sigma]$)

       0                 .71
     200                 .69
     400                 .67
     600                 .65
     800                 .62
    1000                 .60
    1200                 .58
    1400                 .56
    1600                 .53
    1800                 .51
    2000                 .49
    2200                 .47
    2400                 .46
    2600                 .45
    2800                 .43
    3000                 .42
    3200                 .41
    3400                 .40
    3600                 .39
    3800                 .38
    4000                 .37

Example 45: What are the mean and 90% worst wind factors at 15,000
feet for a flight from Pittsburgh to Miami during November? Airspeed
of aircraft: 250 K.


Step 1. Divide the track into legs (equal legs, if possible).

Route: Pittsburgh, Pennsylvania to Miami, Florida.
Airspeed: 250 K.
Track: 180°.
Length: Approximately 1000 miles.
Legs: 1. Pittsburgh to 37°N.
      2. 37°N to 33°N.
      3. 33°N to 29°N.
      4. 29°N to Miami.


Step 2. Determine $\bar{V}_r$ and $\sigma$ for each leg from mean charts, SAC
Manual, summaries, or other source, and $[\sigma]$ and $[\sigma^2]$, where [ ]
indicates an average for all the legs. The following values are obtained:

           $\bar{V}_r$    $\sigma$    $\sigma^2$
Leg 1.     265/50         34          1156
Leg 2.     268/48         32          1024
Leg 3.     268/40         28           784
Leg 4.     270/30         24           576
                                      ----
                          Total       3540

then calculating

$$[\sigma^2] = \frac{\sum \sigma^2}{\text{No. of legs}} = \frac{3540}{4} = 885$$

and

$$[\sigma] = \sqrt{[\sigma^2]} = \sqrt{885} = 29.7$$

Step 3. Determine $\bar{u}$ (the component of $\bar{V}_r$ along the track,
positive if helping wind), $\bar{v}$ (the crosswind component of
$\bar{V}_r$), and $\bar{v}^2$. Since

$$\bar{u} = \bar{V}_r \cos\theta \quad \text{and} \quad \bar{v} = \bar{V}_r \sin\theta$$

where $\theta$ is the angle between the track and the mean wind, the
following data are obtained:

           $\bar{V}_r$   $\theta$   Cos $\theta$   $\bar{u}$   Sin $\theta$   $\bar{v}$   $\bar{v}^2$
Leg 1.     265/50        85         .087           -4.4        .996           49.8        2480
Leg 2.     268/48        88         .035           -1.7        .999           48.0        2304
Leg 3.     268/40        88         .035           -1.4        .999           40.0        1600
Leg 4.     270/30        90         .000            0.0       1.000           30.0         900
                                                   ----                                   ----
                                    Total          -7.5                                   7284


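Step 4. Determine $[\bar{u}]$ and $[\bar{v}^2]$ where

$$[\bar{u}] = \frac{\sum \bar{u}}{\text{No. of legs}} = \frac{-7.5}{4} = -1.9$$

and

$$[\bar{v}^2] = \frac{\sum \bar{v}^2}{\text{No. of legs}} = \frac{7284}{4} = 1821$$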

Step 5. Determine the mean wind factor W from Equation (149):

$$W = [\bar{w}] = [\bar{u}] - \frac{1}{2A}\left([\bar{v}^2] + \frac{[\sigma^2]}{2}\right)$$

$$= -1.9 - \frac{1}{2 \times 250}\left(1821 + \frac{885}{2}\right)$$

$$= -1.9 - \frac{2264}{500} = -1.9 - 4.5$$

W = -6.4 knots


Step 6. If the "90% worst" risk wind factor is desired, determine the
standard deviation of wind factor $\sigma_w$ from

$$\sigma_w = k\,[\sigma]$$

where k is the adjustment factor given in Table 29. For a route length of
1000 miles, k = 0.60; therefore

$$\sigma_w = 0.60 \times 29.7 = 17.8$$

Finally, determine the "90% worst" from

$$W_{90} = W - 1.28\,\sigma_w$$

where 1.28 is the .90 deviate from the cumulative normal distribution:

$$W_{90} = -6.4 - 1.28 \times 17.8 = -6.4 - 22.8 = -29.2 \text{ knots}$$
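
The six steps of Example 45 can be retraced with a short Python sketch. It is offered only as an illustration under stated assumptions: the function and constant names are hypothetical, the Leg 4 wind is taken as 270/30 (the value used in Step 3), and the wind components are computed from full-precision cosines and sines rather than the rounded tabular values, so intermediate figures differ slightly from the hand computation.

```python
import math

TRACK_DEG = 180.0     # direction of flight, Pittsburgh to Miami
AIRSPEED_KT = 250.0
K_FACTOR = 0.60       # Table 29, route length of 1000 miles

# (direction the wind blows from, resultant speed in knots, sigma in knots)
LEGS = [(265.0, 50.0, 34.0), (268.0, 48.0, 32.0),
        (268.0, 40.0, 28.0), (270.0, 30.0, 24.0)]


def leg_components(wind_from_deg, speed_kt, track_deg):
    """Return (u, v): tailwind-positive and crosswind components."""
    # The wind blows toward the reciprocal of its reported direction.
    angle = math.radians(wind_from_deg + 180.0 - track_deg)
    return speed_kt * math.cos(angle), speed_kt * math.sin(angle)


def route_wind_factors(legs, track_deg, airspeed_kt, k, z=1.28):
    """Return (mean wind factor, percentile wind factor) for a route."""
    n = len(legs)
    components = [leg_components(d, s, track_deg) for d, s, _ in legs]
    mean_u = sum(u for u, _ in components) / n
    mean_v2 = sum(v * v for _, v in components) / n
    mean_sigma2 = sum(sig * sig for _, _, sig in legs) / n
    # Equation (149)
    mean_w = mean_u - (mean_v2 + mean_sigma2 / 2.0) / (2.0 * airspeed_kt)
    # Step 6: sigma_w = k [sigma], then Equation (154)
    sigma_w = k * math.sqrt(mean_sigma2)
    return mean_w, mean_w - z * sigma_w


mean_w, worst_90 = route_wind_factors(LEGS, TRACK_DEG, AIRSPEED_KT, K_FACTOR)
print(f"mean wind factor {mean_w:.1f} kt, 90% worst {worst_90:.1f} kt")
```

Rounded to the nearest tenth of a knot, the results agree with the -6.4 and -29.2 knots obtained by hand.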

4. Frequency of Occurrence of Nighttime Illuminations and Various Ceilings
and Visibilities.

a. This technique involves finding the frequency of the joint occurrence
of illumination values and various ceiling/visibility categories. The
material used includes seasonal charts of nighttime illumination and seasonal
graphs of ceiling/visibility categories.

b. Curves for the frequency of occurrence of nighttime illumination at 40
and 50 degrees north latitude were prepared with the aid of illumination
values obtained by Brown [13] (see Figure 30). These values were obtained
from more than 1200 illumination measurements made by Dayton R. E. Brown in
the Arctic, Antarctic, and the temperate and torrid zones of both hemispheres
between January 1943 and May 1947. The curves presented (Figures 25 through
28) are for clear sky conditions only. No attempt has been made to determine
the amount of attenuation of nighttime illumination due to clouds. The values
of illumination are in footcandles. A footcandle is defined as the luminous
energy received on any part of a surface per unit of time when the surface is
normal to and one foot distant from a light power source of one international
candle. The illumination curves in this report are restricted to the
nighttime values bounded by civil twilight. The lower limit of civil twilight
occurs when the sun is six degrees below the horizon and the illumination is
$3.16 \times 10^{-1}$ footcandle.

c. The annual and seasonal ceiling/visibility values presented in Table
30 were obtained from information contained in USAF ETAC Report 4803 (Flying
Weather in Central Europe) [8].

TABLE 30

Ceiling/Visibility Percentages for Central Europe
(All-Hours)

                          Jan    Apr    Jul    Oct    Annual

≥ 500 ft and ≥ 3 mi       58     87     88     71     76
≥ 2000 ft and ≥ 3 mi      42     73     78     59     63
≥ 2000 ft and ≥ 6 mi      26     61     65     42     48
d. The analyst studied the diurnal variation during the midseason months
of various ceiling/visibility categories to determine a representative station
in West Germany. Bitburg Air Base was selected because percentages for
ceiling/visibility categories at this station were within a few percent of
average values for the entire Central European Area. Figure 29 contains graphs
representing diurnal variations during the midseason months of three
ceiling/visibility categories (≥ 500/3, ≥ 2000/3, and ≥ 2000/6) for Bitburg.
The ordinates for graphs in Figure 29 show ratios of the percentage frequency
for the ceiling/visibility categories at any hour of the day to the midseason
months all-hours percentage frequency. Assuming the diurnal variation at this
station is representative of the Central Europe region, use of Figure 29 in
combination with Table 30 gives estimates of average percentage frequencies of
occurrence of the three ceiling/visibility categories during any hour or
period for the midseason months. For example, the average percentage
frequency of category ≥ 2000/3 at 0700 hours for October may be desired.
First, from Table 30, find the percentage frequency of ≥ 2000/3 for all-hours
in October (59%). Then, multiply this percentage by the ratio given in
Figure 29 for category ≥ 2000/3 at 0700 hours for October (.60) to obtain the
required percentage frequency (35.4%).

e. During the preparation of Report 4803, the average all-hours
percentages were compared with the average daylight-hours percentages for a
large number of ceiling/visibility categories during all seasons for a number
of stations in Central Europe. The analysis indicated that the difference was
only a few percent in most cases, and not more than 5% in any case. The same
thing was true for all-hours compared to the nighttime-hours. The reason for
these small differences is apparent from the graphs of diurnal variation
presented in Figure 29. The period with the lowest percentage above any
ceiling/visibility category occurs at sunrise or a few hours after sunrise,
and the period with the highest percentages occurs during the late afternoon.
Therefore, since the periods of highest and lowest conditions both occur
during daylight hours, there is little difference between average percentages
for daylight-hours and all-hours. Consequently, percentages for all-hours may
be used as representative for daylight-hours. The same argument holds true for
nighttime-hours.

f. The probability of joint occurrence of a given illumination level and
one of the ceiling/visibility categories can be obtained by multiplying
together their individual probabilities. This method can be used because the
nighttime illumination and the occurrence of ceiling and visibility are
assumed to be independent events. For example, the joint occurrence of
$1 \times 10^{-4}$ fc and a ceiling/visibility of ≥ 2000/3 in January at 40
degrees north latitude may be desired. From Figure 25, the percent of time
$1 \times 10^{-4}$ fc is equalled or exceeded is 60%. Multiply this percentage
(60%) by the percentage frequency of category ≥ 2000/3 obtained from Table 30
(42%). This gives the joint percent frequency of occurrence (25.2%) of the
desired categories. If it is desired to know the joint occurrence of the same
two categories at midnight, use Figure 29 to find the ratio of the frequency
of occurrence of category ≥ 2000/3 at 0000 hours to the all-hours frequency
(1.09). Multiply this ratio (1.09) by the joint percentage (25.2%) to obtain
the desired percentage frequency (27.5%).

Figure 29. Diurnal Variation in the Occurrence of Various Ceiling/Visibility
Categories: Bitburg AB, Germany.
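
The arithmetic of paragraph f reduces to a product of percentages and a diurnal ratio. A minimal Python sketch follows; the function name and the optional ratio argument are hypothetical, and the inputs are the chart and table readings quoted in the text (60%, 42%, and a Figure 29 ratio of 1.09).

```python
def joint_frequency(illumination_pct, ceiling_vis_pct, diurnal_ratio=1.0):
    """Joint percent frequency, assuming the two events are independent."""
    return illumination_pct * ceiling_vis_pct / 100.0 * diurnal_ratio


all_hours = joint_frequency(60.0, 42.0)        # January, 40 degrees north
midnight = joint_frequency(60.0, 42.0, 1.09)   # ratio read from Figure 29
print(f"all hours {all_hours:.1f}%, midnight {midnight:.1f}%")
```

The independence assumption is what allows the simple product; if illumination and cloud cover were correlated, a joint tabulation of the two would be needed instead.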

g. Values in Table 30 are intended to provide a general description of
the three ceiling/visibility categories; they are limited in their use. They
should not be used when detailed information on ceiling/visibility frequencies
for a station or small area is required. These frequencies should not be used
for areas immediately adjacent to water, since the land-sea contrast will
cause large variations from the percentages shown. Further, the
meteorological stations used to obtain these percentages are located at
airports or in cities, and their elevations are all less than 2000 feet above
sea level; therefore, they should not be used to determine representative
ceiling/visibility frequencies for regions above 2000 feet elevation. A
restriction to the use of the illumination curves is evident since the curves
(Figure 30) are for clear sky conditions only. Also, attenuation of the
nighttime illumination by clouds was not included, since very little is known
on this subject at the present time. The only major reduction of illumination
would be from heavy storm clouds and this type of cloud occurs only a small
percentage of the time. If the information in this report is used to compute
the joint frequency of occurrence of ceiling/visibility and illumination, it
must be remembered that the percentage obtained may be slightly high when
clouds are present. Assuming a reduction factor of 50% in illumination due to
clouds (this is the value Figure 30 gives for the reduction of sunlight under
average cloud conditions), we find for 40°N in January the value of
$1 \times 10^{-4}$ fc has a frequency of occurrence of approximately 55%
instead of 60%. Therefore, the desired answer would be 23.1% (or 55% x 42%),
instead of 25.2% as previously computed. This indicates that a 50% reduction
of illumination only changes the joint frequency by about 2%. Further
investigation of the illumination curves in this respect shows the maximum
reduction in frequency of occurrence to be about 10%. In the calculations
shown above, this would mean a reduction in the joint frequency of about 4%.

5. Climatological Analysis for Evaluation of Close Ground Support Aircraft.

a. The purpose of this technique is to establish the relative merits of
high and low performance aircraft in close ground support missions, based on
their inability to operate below specified minimums of ceiling and visibility.
There are three steps involved:

(1) Establishing relationships between specified operating factors
and the required ceiling/visibility for a ground support mission.
Relationships between operating factors and weather requirements were
established by devising an attack profile nomogram, which showed these
comparisons could be made by classifying the performance of the aircraft
according to its airspeed capabilities.

(2) Determining the frequency of ceiling and visibility values over
variable terrain for specific geographical areas of interest. The frequency
of ceiling and visibility values for rather large areas was determined by
integrating the frequency distribution of terrain with the frequency
distribution of ceilings and visibilities at known points in the area. Thus,
it is possible to determine the expectation of any required ceiling and
visibility minimum for targets distributed at random throughout a
geographical area.

(3) Relating the ceiling/visibility requirements to the frequencies
of the occurrence in the geographical areas. Relating the required
ceiling/visibility criteria for various speeds of aircraft to the expected
frequency of those criteria enabled the analyst to compare the aircraft
performance on the basis of the ceiling and visibility climatology of an area.

b. A theoretical attack profile was devised for an estimate of the
relation between required ceilings/visibilities and different constant
airspeeds, angles of dive, and time in the dive. It was initiated along these
lines:

(1) The establishment of ceiling and visibility as criteria for
determining whether a support aircraft could operate required a knowledge of
the ceilings and visibilities needed to perform different types of close
ground-support missions. Headquarters, United States Air Force; tactical units
of the Air Force; Headquarters, United States Air Force in Europe; the Office
of the Deputy Chief of Staff for Operations, United States Army in Europe; and
pilots experienced in fighter-bomber techniques were queried concerning
operating features and the required ceiling and visibility for various types
of aircraft performing close ground-support missions.

(2) Hq USAF estimated the F-104 minimum ceiling and visibility to be
1500 feet and 3 miles with a cruise speed of 450 knots and a minimum speed
of 285 knots. They estimated the Mohawk would require 500 feet and 1 mile
with a cruise speed of 200 knots. Air Force tactical units estimated 6000-
to 10,000-foot ceilings would be required for dive bombing, and a minimum of
500 feet and 3 miles would be required for strafing flat terrain (high
performance aircraft implied). Hq USAF suggested a minimum acceptable flight
visibility of three miles and a desirable visibility of five or more miles and
the following ceilings:

(a) Low-level bombing (napalm or skip): 2500 feet, with 1000 feet
as a combat minimum.

(b) High-angle dive bombing: 10,000 feet.

(c) Rocketry: 6000 feet.

(d) Strafing: 2000 feet.

(e) GAM-83A: 20,000 feet.

(3) Hq USAREUR is quoted for minimum criteria of ceiling and visibility
for low performance aircraft during field training exercises: fixed-wing
aircraft, free of physical contact with clouds and visibility 1 mile or
greater; and rotary-wing aircraft, free of physical contact with clouds and
visibility 1/2 mile or greater.

(4) Pilots pointed out the following features: time spent in the
actual dive should be on the order of 15 seconds; vertical clearance from the
cloud deck should be at least 500 feet; clearance above the ground should be
at least 100 feet, but depends on speed, angle of dive, and type of mission;
and visibility depends on ceiling, speed of the aircraft, and terrain.

(5) Figure 31 shows the attack profile. The support aircraft
approaches the target area in level flight, the target is located, the craft
is put into its attack run, the attack is accomplished, and the aircraft
departs the area. Several variables can be put into equations that can be
solved for the ceiling and visibility required for the maneuver. Variables and
the symbols representing them are:

$\theta$ = angle of dive. Angles of 5, 10, 15, 20, and 40 degrees were used
in calculations to cover most of the practical range of dive angles.

A = airspeed in knots. Airspeeds are considered constant during the
attack. Speeds ranging from 100 through 1000 knots were used.

a = acceleration (ft/sec$^2$).

$a_1$ = acceleration during pullout (ft/sec$^2$).

$a_2$ = acceleration (neg.) entering dive (ft/sec$^2$).

$C_n$ = required minimum ceiling height for the maneuver (feet). The
aircraft is assumed to be just clear at the start of the attack.

G = ratio of acceleration force to acceleration of gravity.

$h_o$ = terrain clearance in pullout (feet).

$h_1$ = vertical distance from point of alignment to start of recovery (feet).

R = radius of curvature (feet).

S = airspeed (feet/second).

t = time from point of alignment to start of recovery (seconds).

$V_n$ = minimum visibility from start of attack to recovery (miles).

Figure 31. Theoretical Attack Profile.

Development of the equations follows:

General equation for ceiling:

$$R = S^2/a$$

$$\Delta h = \Delta h_r = R - R\cos\theta = (S^2/a)(1 - \cos\theta)$$

$$h_r = h_o + \Delta h_r = h_o + (S^2/a)(1 - \cos\theta)$$

$$C_n = \Delta h + h_1 + h_r$$

$$C_n = (S^2/a_2)(1 - \cos\theta) + S\,t\sin\theta + h_o + (S^2/a_1)(1 - \cos\theta)$$

Let $a_1 = a_2 = a$:

$$C_n = S\,[\,t\sin\theta + (2S/a)(1 - \cos\theta)\,] + h_o$$

Conversion to common units:

$$C_n = h_o + 1.69\,A\,[\,t\sin\theta + 0.105\,(A/G)(1 - \cos\theta)\,]$$

General equation for $V_n$ (slant visibility):

$$V_n \approx \text{horizontal distance}$$

$$\approx S\,t + 2R\tan\theta = S\,t + 2(S^2/a)\tan\theta = S\,[\,t + (2S/a)\tan\theta\,]$$

Conversion to common units:

$$V_n \approx 0.00032\,A\,[\,t + (0.105\,A/G)\tan\theta\,]$$

The G term is defined by letting G = 1 + (A/200), where A/200 is a ratio
allowing for variable G limits in relation to the speed of the aircraft; a
higher speed (higher performance aircraft) pulls more G forces. Then for

A = 150    200    400    600    1000 knots

G = 1.75   2.00   3.00   4.00   6.00
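As a rough numerical check on the converted equations, the following Python sketch evaluates the height loss ($C_n - h_o$) and the minimum visibility $V_n$ for a given airspeed, dive angle, and dive time. The function names are illustrative and do not come from the report; the constants 1.69, 0.105, and 0.00032 and the relation G = 1 + A/200 are taken from the text.

```python
import math


def g_factor(airspeed_kt):
    """G = 1 + A/200, the speed-dependent G limit defined in the text."""
    return 1.0 + airspeed_kt / 200.0


def height_loss_ft(airspeed_kt, dive_angle_deg, dive_time_s):
    """C_n - h_o in feet: the ceiling requirement without terrain clearance."""
    theta = math.radians(dive_angle_deg)
    g = g_factor(airspeed_kt)
    return 1.69 * airspeed_kt * (
        dive_time_s * math.sin(theta)
        + 0.105 * (airspeed_kt / g) * (1.0 - math.cos(theta))
    )


def min_visibility_mi(airspeed_kt, dive_angle_deg, dive_time_s):
    """V_n in miles: horizontal distance required for the maneuver."""
    theta = math.radians(dive_angle_deg)
    g = g_factor(airspeed_kt)
    return 0.00032 * airspeed_kt * (
        dive_time_s + 0.105 * (airspeed_kt / g) * math.tan(theta)
    )


# Example: 10-degree dive held for 10 seconds at three airspeeds.
for airspeed in (200, 400, 600):
    print(
        airspeed,
        round(height_loss_ft(airspeed, 10, 10)),
        round(min_visibility_mi(airspeed, 10, 10), 2),
    )
```

For a 10-degree dive lasting 10 seconds at 200 knots, the formulas give roughly 640 feet and 0.76 mile. Values read from a nomogram can differ somewhat from the computed ones.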

(6) Figures 32, 33, and 34 are nomograms to determine the height loss and the
minimum required visibility for a theoretical attack profile with t = 5, 10,
and 15 seconds, respectively. Terrain clearance is eliminated to maintain the
versatility of the nomograms. The required ceiling is obtained by adding the
desired terrain clearance ($h_o$) to the height value obtained from the
nomogram and, if applicable, the required initial clearance between the
aircraft and cloud base.

(7) The $V_n$ value is the horizontal distance required for the maneuver and
is, therefore, a minimum visibility requirement. Most likely the pilot would
need to see the target some time before he starts the dive. If the values of
$V_n$ given by the nomogram are multiplied by a factor of 1.5, the visibility
requirement has a value closer to those suggested by the Air Force sources
previously queried. The nomograms were used to establish ranges of ceilings
and visibilities needed in climatological analyses of specific geographical
areas.

Figure 32. Attack Profile Nomogram, t = 5 sec.

Figure 33. Attack Profile Nomogram, t = 10 sec.

c. A terrain versus ceiling and visibility analysis is required because
ceilings are measured from ground to cloud base and ceiling heights over an
area depend in part on terrain elevation.

(1) This terrain-height effect must be incorporated in the analysis of
ceiling-height frequencies over an area. The procedure is to convert measured
ceiling heights to the equivalent altitude of cloud bases above sea level;
then ceilings above other points in the vicinity are determined by subtracting
terrain elevations at these points. This assumes that cloud deck bases are
substantially level. Examination of data for stations at different elevations
indicates the assumption is valid.
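The conversion described in (1) can be sketched in a few lines of Python. The function name and the sample elevations are hypothetical; the only assumption carried over from the text is that the cloud base is level over the area.

```python
def ceiling_at_point(station_ceiling_ft, station_elev_ft, point_elev_ft):
    """Ceiling above a nearby point, assuming a level cloud base.

    The measured ceiling (above ground at the station) is converted to a
    cloud-base altitude above sea level, and the terrain elevation of the
    other point is then subtracted. A result of 0 means the terrain at
    that point reaches the cloud base.
    """
    cloud_base_msl_ft = station_ceiling_ft + station_elev_ft
    return max(cloud_base_msl_ft - point_elev_ft, 0)


# Hypothetical case: a 1500-ft ceiling measured at a station 300 ft above
# sea level puts the cloud base at 1800 ft above sea level.
print(ceiling_at_point(1500, 300, 1000))  # point at 1000 ft elevation
print(ceiling_at_point(1500, 300, 2000))  # point above the cloud base
```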

(2) Contour maps for Europe, Southeast Asia, and Korea were used to determine
the frequency distribution of terrain heights (Figure 35). By a process of
weighting and integration, the analyst was able to apply ceiling and
visibility data from available weather observing sites to the terrain-height
distribution of the area being studied to determine ceiling-height frequencies
over the area.
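The report does not spell out the weighting and integration step. One plausible reading, offered here only as an assumption, is to weight the station's ceiling frequency for each terrain-height class by the fraction of the area in that class. The function name, the terrain classes, and the sample ceilings below are all hypothetical.

```python
def area_frequency_below(ceilings_ft, station_elev_ft, terrain_classes, threshold_ft):
    """Area-weighted fraction of observations with ceiling below threshold_ft.

    ceilings_ft     : observed ceilings at the station (feet above ground)
    station_elev_ft : station elevation (feet above sea level)
    terrain_classes : list of (terrain_elev_ft, area_fraction) pairs whose
                      fractions sum to 1 (the terrain-height distribution)
    threshold_ft    : ceiling value of interest (feet above local terrain)
    """
    total = 0.0
    for terrain_elev_ft, area_fraction in terrain_classes:
        # Ceiling over this terrain class, assuming a level cloud base.
        below = sum(
            1
            for ceiling in ceilings_ft
            if (ceiling + station_elev_ft) - terrain_elev_ft < threshold_ft
        )
        total += area_fraction * below / len(ceilings_ft)
    return total


# Hypothetical data: half the area at station level, half 1000 ft higher.
observed = [500, 1500, 3000, 8000]
terrain = [(200, 0.5), (1200, 0.5)]
print(area_frequency_below(observed, 200, terrain, 1000))
```

In this made-up case the higher terrain doubles the frequency of sub-1000-ft ceilings over its half of the area, which is the effect the terrain analysis is meant to capture. Unlimited ceilings and visibility are not handled in this sketch.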

(3) Climatological data of selected stations throughout the geographical
areas of interest were analyzed. The theoretical attack profile nomograms
indicated an analysis of ceilings from approximately 100 through 25,000 feet
and visibilities of 1, 3, and 6 miles would be adequate for the study.
Ceiling/visibility data were analyzed for annual frequencies and "worst
season" frequencies when one could be determined. Results are shown in Figures
36 through 40. Each of the figures shows the frequency of ceiling and/or
visibility being less than the specified values. The selection of "worst
season," in some cases, is rather arbitrary. For example, in Korea the poorer
visibilities occur in winter while the frequency of ceilings is greater in
summer.

d. The analyst compared a 200-, 400-, and 600-knot required speed for a
hypothetical strafing mission. Each aircraft required a 10-degree dive for 10
seconds and terrain clearance of 100 feet to accomplish the mission. Referring
to Figure 33, the nomogram furnished the $C_n - h_o$ and $V_n$ values. The
required ceiling, $C_n$, is determined by adding the desired terrain
clearance. Values for each speed are:
Figure 36. Annual Percentage Frequency Ceiling and/or Visibility Less Than
Specified Values - European Area.

Figure 37. Percentage Frequency Ceiling and/or Visibility Less Than Specified
Values - European Area.

Figure 38. Annual Percentage Frequency Ceiling and/or Visibility Less Than
Specified Values - Southeast Asia Area.
A (kt)    C_n - h_o (ft)    V_n (mi)    C_n (ft)    V = 1.5 V_n (mi)

200       660               0.78        760         1.2
400       1360              1.60        1460        2.4
600       2100              2.50        2200        3.8
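The last two columns follow from the nomogram readings by simple arithmetic: add the 100-foot terrain clearance to the height value, and multiply $V_n$ by the factor of 1.5 suggested earlier. A minimal Python sketch of that step, with illustrative names that are not from the report:

```python
TERRAIN_CLEARANCE_FT = 100  # terrain clearance stated for the mission
VISIBILITY_FACTOR = 1.5     # factor applied to the nomogram V_n values

# (airspeed in knots, nomogram height value in feet, nomogram V_n in miles)
nomogram_readings = [(200, 660, 0.78), (400, 1360, 1.60), (600, 2100, 2.50)]


def mission_requirements(readings):
    """Return (airspeed, required ceiling in ft, required visibility in mi)."""
    return [
        (airspeed, height + TERRAIN_CLEARANCE_FT, VISIBILITY_FACTOR * visibility)
        for airspeed, height, visibility in readings
    ]


for airspeed, ceiling_ft, visibility_mi in mission_requirements(nomogram_readings):
    print(airspeed, ceiling_ft, round(visibility_mi, 2))
```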

By referring to the climatological analysis for the specific areas, the
analyst compared the nonoperable time of each aircraft for the specific
mission.

Annual Percent of Time Nonoperable

                   200 Knots    400 Knots    600 Knots

European           14%          29%          46%
Southeast Asian    10%          13%          19%
Korean             11%          18%          27%

"Worst Season" Percent of Time Nonoperable

                   200 Knots    400 Knots    600 Knots

European           24%          51%          70%
Southeast Asia     not readily definable for area studied
Korean             21%          31%

The high-speed requirement (600 knots) increases the nonoperable time to
approximately two to three times that of the low-speed requirement (200
knots). The European area is approximately twice as bad as the other areas for
ground support operations because of poor ceilings and/or visibilities.
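The "two to three times" comparison can be checked directly from the annual figures. The sketch below assumes the annual percentages read from the table above (14, 29, 46; 10, 13, 19; 11, 18, 27); the dictionary layout and function name are illustrative only.

```python
# Annual percent of time nonoperable, keyed by required airspeed in knots.
annual_nonoperable_pct = {
    "European": {200: 14, 400: 29, 600: 46},
    "Southeast Asian": {200: 10, 400: 13, 600: 19},
    "Korean": {200: 11, 400: 18, 600: 27},
}


def speed_ratio(table, fast=600, slow=200):
    """Ratio of nonoperable time at the fast speed to that at the slow speed."""
    return {area: row[fast] / row[slow] for area, row in table.items()}


for area, ratio in speed_ratio(annual_nonoperable_pct).items():
    print(f"{area}: {ratio:.1f}")
```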

e. In summary, this technique provides a method for comparing the relative
merits of aircraft with various airspeed capabilities and corresponding
ceiling and visibility in respect to requirements of a close-support target.
The attack profile nomogram shows the relationship between airspeed, angle,
and time of dive to the required ceiling/visibilities. Values obtained from
the nomogram agree quite well with values estimated by authoritative sources,
and furnish an objective means of interpolating criteria for different attack
conditions. Regardless of how realistic the required ceiling and visibility
values obtained through the nomograms are, the climatological analysis can be
used to make the desired comparisons. Certainly, in some geographical areas or
for certain types of weapon delivery, ceiling and visibility requirements may
differ from the minimal values used here. Over variable terrain, such as
mountainous areas, higher ceilings and visibilities might be required for
operations. Special weapons may require more or less time on target for
delivery. There is no implication that a 600-knot fighter must always perform
a ground-support mission at 600 knots; many of the "Century Series" can
operate at lower speeds. Comparison between 200 knots and 400 knots shows
smaller differences in the example as presented. As differences in speeds
become less, differences in nonoperable time become insignificant.

6.  Climatology  of  Atlantic  Tropical  Storms. 

a. Tropical storms of the Atlantic and Caribbean have been the subject of
much research, primarily aimed at improving the forecast of storm movement;
many theoretical and empirical studies have been advanced to improve these
predictions. Techniques (statistical, objective, subjective, and numerical)
are complicated and most require a considerable amount of synoptic information
concerning the nature of the circulation in and around the storm region. Even
though new methods have been advanced and some achievements made in this
respect, no outstandingly successful prediction system has been developed.
When pertinent information is not available for making a forecast, or when the
centrally prepared forecast is unavailable to the forecaster, a probabilistic
forecast utilizing a climatological approach may be desirable. In some cases a
statement of probability concerning storm movement may be entirely sufficient
for the user.

b. There are available many studies of various climatological aspects of
tropical storms. However, few are designed to furnish a probabilistic forecast
as the end product. Studies by Malone [35] and Appleman [10] are particularly
interesting because they utilize the uncertainty of synoptic and statistical
prediction schemes and treat these errors as vector quantities. However, these
studies estimate the probability of hurricane movement based upon the
short-period forecast movement.

c. In 1960, McCabe [33] prepared a series of preliminary reports on the
climatology of Far East typhoons. In these reports he describes the adaptation
of the circular normal distribution of vector quantities to the probability of
a typhoon trajectory approaching a target from a known storm position. The
results of "goodness of fit" tests seem to justify using this method, as
developed, in spite of the fact that some of the statistics show distributions
of storm movement vectors that are elliptical rather than circular.

d. If the McCabe method is considered applicable for use with Atlantic
hurricanes, the following actions by an analyst are necessary:

(1) Data in punched-card form on Atlantic storm positions, with estimates of
storm intensity, are obtained from the National Hurricane Research Project.
The card deck is composed of approximately 10,000 cards covering the period
1896-1959. It consists of 0000Z and 1200Z storm position reports.
Figure 41. Resultant Vector Direction and Speed (kts).

(2) Assumptions concerning storm intensity classification and storm movement
are made.

(3) The data are then entered on tape and the resultant vector direction
($\psi$) and speed ($V_p$), standard vector deviation of storm velocity
($\sigma_v$), and ratio of the standard vector deviation of storm velocity to
the mean vector speed ($\sigma_v/V_p$), among other parameters of storm
movement, are computed for each 5-degree square.

(4) Pertinent values are then plotted on charts and an isoline analysis of
each is prepared.
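The quantities computed in step (3) can be illustrated for the storm movements falling in a single 5-degree square. This is a sketch under stated assumptions rather than the procedure actually run on the card deck: headings are taken as the direction toward which the storm moves, in degrees clockwise from north, and the standard vector deviation is computed with a divisor of n. The function name and sample data are hypothetical.

```python
import math


def vector_stats(movements):
    """Resultant direction, resultant speed, sigma_v, and sigma_v / speed.

    movements: list of (heading_deg, speed_kt) pairs for one square.
    """
    n = len(movements)
    # Resolve each movement into east (u) and north (v) components.
    u = [speed * math.sin(math.radians(heading)) for heading, speed in movements]
    v = [speed * math.cos(math.radians(heading)) for heading, speed in movements]
    u_mean, v_mean = sum(u) / n, sum(v) / n

    resultant_speed = math.hypot(u_mean, v_mean)
    resultant_dir = math.degrees(math.atan2(u_mean, v_mean)) % 360.0

    # Standard vector deviation: RMS distance of the vectors from their mean.
    variance = sum(
        (ui - u_mean) ** 2 + (vi - v_mean) ** 2 for ui, vi in zip(u, v)
    ) / n
    sigma_v = math.sqrt(variance)

    ratio = sigma_v / resultant_speed if resultant_speed > 0 else math.inf
    return resultant_dir, resultant_speed, sigma_v, ratio


# Hypothetical square with two movements: due north and due east at 10 knots.
print(vector_stats([(0, 10), (90, 10)]))
```

A small ratio indicates storms that move consistently along the resultant vector; a ratio near or above 1 indicates that the scatter of individual movements is as large as the mean motion itself.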

e. Charts of resultant vector direction and speed (Figure 41) and the ratio
($\sigma_v/V_p$) (Figure 42) are used as basic work charts for computing
composite trajectory frequency plots. One other basic chart, or set of charts,
is required: the standard plot of the ratio $\sigma_v/V_p$ as shown in Chapter
4, Section D of the CMF [9]. The standard plot must be adjusted for the
particular target radius desired and the scale of the base map being used
(Figure 43). In studies of Atlantic tropical storms, the assumed storm or
target radius is 60 nautical miles; in the Far East typhoon study, the assumed
radius is 120 nautical miles.

Figure 42. Ratio of Standard Vector Deviation to Mean Vector Speed
($\sigma_v/V_p$).

f. The shape of the standard plot, as described in CMP 4-F [9], is determined
by the $\sigma_v/V_p$ value. Although these plots are intended to be applied
where the descriptive parameters ($V_p$, $\sigma_v$) are uniform, segments of
the applicable plots can be combined to form a plot describing the
probabilities of vectors reaching a target area. This is done in the following
manner:

(1) From the charts of $\psi$ and $V_p$, and $\sigma_v/V_p$, indicate the
upstream resultant vector (from the target) and the changes of the ratio along
the upstream vector (Figure 43). This vector corresponds to the streamline
through the point of interest.

(2) Trace sections of the applicable standard plots along the appropriate
segment of the resultant vector. Smooth into a composite frequency plot.
(3) To give some indication of the distance a storm may move in a given time
interval, a median 24-hour and 48-hour storm movement line can be drawn. The
distance is determined by approximating the average $V_p$ over the appropriate
upstream distance. This distance is then an approximation to the median
distance of storm movement.

g. These composite frequency plots provide a climatological estimate of the
frequency that tropical storms and hurricanes (at random by reason of their
present location) subsequently come within 60 nautical miles of the base
location (Figure 44). That is, if a storm is located somewhere between the 10%
and 20% isolines, the climatological probability of that storm coming within
60 nautical miles of the base is between .10 and .20; if the storm is located
in the area delineated by the 50% and 60% isolines, the storm has a 50% to 60%
chance of reaching the base area. Since plots are graphical descriptions of
the normal behavior of storms during a period of time, a storm that is moving
in opposition to the standard vector is probably less of a threat than one
following the normal track; however, no attempt has been made to assess the
short-period trajectory [?].

Figure 44. Composite Frequency Plot.

h. Tests of composite frequency plots with the storm tracks show that the
plots are reasonable; however, a certain amount of subjectivity was inherent
in the process of making a decision as to whether a storm was in or out of a
particular zone. Since there are few storms to consider, a different
subjective evaluation may alter the results in the high probability classes by
±10%. The following are the results of this "goodness of fit" test on the
plots prepared for Eglin and Patrick Air Force Bases:
[Table: percent of storms entering each isoline zone that eventually reached
the target area, by month, for Eglin and Patrick AFBs. Most row and column
labels are lost; the surviving entries, in source order, are:]

Sep
18
27    36    20    19
35
42    52    33    42    39    47
73    60    45    77    70    69
> 70%    80    71    80    88    82
The test consisted of computing the percent of storms entering the indicated
zone and eventually reaching the target area. Reading across the top line, 13%
of the August storms that entered the plot area delineated by the 10% and 20%
isolines eventually came within 60 nautical miles of Eglin AFB, 17% in
September, and [value illegible] in October.
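The test statistic described above is a simple conditional percentage. The following Python sketch shows the computation on made-up storm records; the record layout, zone labels, and function name are assumptions for illustration, and no values from the report's table are used.

```python
def percent_reaching_target(storms, zone):
    """Percent of storms that entered `zone` and later reached the target.

    storms: list of (zones_entered, reached_target) pairs, where
            zones_entered is a set of zone labels and reached_target is a bool.
    Returns None if no storm entered the zone.
    """
    outcomes = [reached for zones, reached in storms if zone in zones]
    if not outcomes:
        return None
    return 100.0 * sum(outcomes) / len(outcomes)


# Hypothetical sample: eight storms entered the 10-20% zone, one reached
# the target area.
sample = [({"10-20%"}, False)] * 7 + [({"10-20%", "20-30%"}, True)]
print(percent_reaching_target(sample, "10-20%"))

# Reclassifying a single storm shows how sensitive small samples are.
reclassified = sample[:6] + [({"10-20%"}, True)] + sample[7:]
print(percent_reaching_target(reclassified, "10-20%"))
```

With so few storms per zone, moving one storm in or out of a zone shifts the percentage by many points, which is the subjectivity effect noted in paragraph h.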

i. Some possible applications of the composite frequency plots are:

(1) In the field of industrial meteorology the probabilistic approach to
business decision-making is growing rapidly. It has been used for a long time
in the planning phase and more recently in the operational phase. The use of
this climatological estimate of the threat that a given storm poses to a given
location requires only a position report and no knowledge of the complex
atmosphere in which it is embedded.

(2) Consider the simplified case of an entrepreneur in the hurricane region.
Aware of the presence of a hurricane in the area, he is faced with the problem
of deciding whether to close his place of business temporarily and move his
stock to a more protected location. Assume that the owner has some measure of
the cost involved in taking protective action that does not involve movement
of the stock or assets, and some measure of the cost of the alternative
decision if the storm destroys his place of business. As suggested by Thompson
and Brier [46], the following simple ratio may be a guide for decision:

$$P \ge C/L$$

In this case, P is the probability of the storm coming within some prescribed
distance, C is the total cost of taking protective action, and L is the
potential loss. The decision criterion indicates that protective action should
be taken when the probability that the hurricane will come into the area of
interest is as great as or greater than the cost-loss ratio. Conversely, if
the probability is less than the cost-loss ratio, no protective action need be
taken. The economic factors involved in taking protective measures, estimating
the uninsured loss, and insuring against prohibitive loss are complex, and a
considerable amount of money is involved. In light of this, it may be of
mutual interest for insurance companies to require protective measures at some
level of storm threat. Such insurance may benefit both parties through reduced
rates to the insured and a smaller loss of insured goods to the insurance
company.
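The decision criterion described above (protect when P is at least the cost-loss ratio C/L) is easy to express in code. The function name and the dollar figures below are hypothetical; in practice P would be read from the composite frequency plot for the storm's current position.

```python
def should_protect(probability, cost, loss):
    """Cost-loss rule: take protective action when P >= C / L.

    probability: climatological probability of the storm reaching the area
    cost:        total cost of taking protective action (C)
    loss:        potential loss if no action is taken and the storm hits (L)
    """
    if loss <= 0:
        raise ValueError("loss must be positive")
    return probability >= cost / loss


# Hypothetical figures: protection costs 2,000 against a potential loss of
# 20,000, so the cost-loss ratio is 0.10.
print(should_protect(0.15, 2000, 20000))  # storm between the 10% and 20% isolines
print(should_protect(0.05, 2000, 20000))  # storm outside the 10% isoline
```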